Abstract

A theorem of L. Caffarelli implies the existence of a map, pushing forward a source
Gaussian measure to a target measure which is more log-concave than the source one,
which contracts Euclidean distance (in fact, Caffarelli showed that the optimal-transport
Brenier map $T_{opt}$ is a contraction in this case). We generalize this result to more general
source and target measures, using a condition on the third derivative of the potential, by
providing two different proofs. The first uses a map $T$, whose inverse is constructed as
a flow along an advection field associated to an appropriate heat-diffusion process. The
contraction property is then reduced to showing that log-concavity is preserved along the
corresponding diffusion semi-group, by using a maximum principle for parabolic PDE.
In particular, Caffarelli's original result immediately follows by using the Ornstein-Uhlenbeck
process and the Prékopa-Leindler Theorem. The second uses the map $T_{opt}$
by generalizing Caffarelli's argument, employing in addition further results of Caffarelli.
As applications, we obtain new correlation and isoperimetric inequalities.

1 Introduction 

The starting point of this work is the following "Contraction Theorem" of L. Caffarelli [14]: 

Theorem (Caffarelli). Let $\mu = \exp(-Q(x))dx$ and $\nu = \exp(-(Q(x) + V(x)))dx$ denote
two Borel probability measures on Euclidean space $(\mathbb{R}^n, |\cdot|)$, where $Q$ denotes a quadratic
function, i.e.

$$Q(x) = \langle Ax, x\rangle + \langle b, x\rangle + c , \tag{1.1}$$

with $A$ positive-definite, and $V$ is a convex function. Then the Brenier optimal-transport
map $T = T_{opt}$ pushing forward $\mu$ onto $\nu$ is a contraction:

$$\forall x, y \in \mathbb{R}^n \quad |T(x) - T(y)| \le |x - y| .$$

Let us recall some of the notions used above. A Borel map $T$ is said to push-forward
$\mu$ onto $\nu$, denoted $T_*(\mu) = \nu$, if $\nu(A) = \mu(T^{-1}(A))$ for any Borel set $A$. Among all
such maps $T$, it is natural to minimize the squared-distance transport cost:
$W_2^2(\mu,\nu) := \inf_{T_*\mu = \nu} \int |T(x) - x|^2 d\mu(x)$; this is precisely the Monge (or Monge-Kantorovich) problem
for a quadratic cost. The Brenier map $T_{opt} : \mathbb{R}^n \to \mathbb{R}^n$ pushing forward $\mu$ onto $\nu$ is the $\mu$-a.e.
unique map for which the latter infimum is attained; it is precisely characterized by the
property of being the gradient of a convex function $\varphi : \mathbb{R}^n \to \mathbb{R}$, as first proved by Y. Brenier
[10]. It is known that the optimal-transport distance $W_2$ metrizes the Wasserstein space
$W_2(\mathbb{R}^n)$ of square integrable Borel probability measures on $\mathbb{R}^n$ equipped with a suitable
weak topology. We refer to [53, 54] for a comprehensive account of this and related topics.
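
To make the statement concrete, here is a small worked example in dimension one with two centered Gaussian measures. It is an illustration added for orientation and is not taken from the original argument; the parameter $\sigma$ and the particular choices of $Q$ and $V$ are assumptions made for the example.

```latex
% Illustration (assumed data): n = 1, Q(x) = x^2/2 + (1/2)\log(2\pi), so \mu = N(0,1).
% Take V(x) = (1/2)(\sigma^{-2} - 1) x^2 + \log\sigma with 0 < \sigma \le 1, so V is convex.
\[
\nu = \exp\bigl(-(Q(x)+V(x))\bigr)\,dx
    = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/(2\sigma^2)}\,dx = N(0,\sigma^2) .
\]
% T(x) = \sigma x is the derivative of the convex function \varphi(x) = \sigma x^2/2
% and pushes N(0,1) forward onto N(0,\sigma^2); hence it is the Brenier map, and
\[
|T(x) - T(y)| = \sigma\,|x-y| \le |x-y| , \qquad
W_2^2(\mu,\nu) = \int |\sigma x - x|^2 \, d\mu(x) = (1-\sigma)^2 .
\]
```

In this example the contraction constant $\sigma$ degenerates to $1$ exactly when $V$ is constant, which matches the intuition that the map contracts more the "more log-concave" the target is.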

1.1 Main Result 

Fix an orthogonal decomposition of $(\mathbb{R}^n, |\cdot|)$ into subspaces $\{E_i\}_{i=0}^k$.

Definition. We will say that a function $F : \mathbb{R}^n \to \mathbb{R}$ satisfies our symmetry assumptions
if it is invariant under the action of the subgroup $O(E_1, \ldots, E_k) := 1 \times O(E_1) \times \ldots \times O(E_k)$
of the orthogonal group $O(n)$, or equivalently, if:

$$\exists \Phi : \mathbb{R}^{\dim E_0 + k} \to \mathbb{R} \ \text{ so that } \ F(x) = \Phi(\mathrm{Proj}_{E_0} x, |\mathrm{Proj}_{E_1} x|, \ldots, |\mathrm{Proj}_{E_k} x|) . \tag{1.2}$$

We will similarly say that a map $T : \mathbb{R}^n \to \mathbb{R}^n$ satisfies our symmetry assumptions if it
commutes with the action of the latter subgroup.

Our main result generalizes Caffarelli's Theorem as follows: 

Theorem 1.1. Let $\mu = \exp(-U(x))dx$ and $\nu = \exp(-(U(x) + V(x)))dx$ denote two Borel
probability measures on Euclidean space $(\mathbb{R}^n, |\cdot|)$. Assume that $U \in C^{3,\alpha}_{loc}(\mathbb{R}^n)$ ($\alpha > 0$) is a
convex function of the form:

$$U(x) = Q(\mathrm{Proj}_{E_0} x) + \sum_{i=1}^k \rho_i(|\mathrm{Proj}_{E_i} x|) , \quad \forall i = 1 \ldots k \quad \rho_i''' \le 0 \ \text{ on } \mathbb{R}_+ , \tag{1.3}$$

where $Q : E_0 \to \mathbb{R}$ is a quadratic function as in (1.1), and that $V : \mathbb{R}^n \to \mathbb{R}$ is convex and
satisfies our symmetry assumptions (1.2). Then there exists a map $T : \mathbb{R}^n \to \mathbb{R}^n$ pushing
forward $\mu$ onto $\nu$ and satisfying our symmetry assumptions which is a contraction.

Remark. The smoothness assumption on $U$ above is immaterial, and may be dispensed
with if $\mu$ is approximated (say, in total-variation distance) by measures which satisfy the
conditions of Theorem 1.1 (see Lemma 3.3). A prototypical example where this applies is
for the functions $\rho_i(x) = |x|^{p_i}$, $p_i \in [1, 2]$. The same comment applies for $V$, so it is enough
to prove the theorem for smooth $U$, $V$, and conclude by a compactness argument detailed
in Section 3.

The general formulation of Theorem 1.1 interpolates between the following extremal 
cases: 

• $\mu$ is a product measure and $V$ is "unconditional":

$$U(x) = \sum_{i=1}^n \rho_i(|x_i|) \ \text{ with } \ \rho_i''' \le 0 \ \text{ and } \ V(x_1, \ldots, x_n) = V(\pm x_1, \ldots, \pm x_n) \ \text{ are convex} .$$

• $U$ and $V$ are both radial:

$$U(x) = \rho(|x|) \ \text{ with } \ \rho''' \le 0 \ \text{ and } \ V(x) = \Phi(|x|) \ \text{ are convex} .$$

• $U$ is quadratic and $V$ is an arbitrary convex function.

We shall be mainly interested in the first case, since the third one follows immediately 
from Caffarelli's result, and the second one may be easily obtained using a one dimensional 
argument reproducing Caffarelli's original proof, as described in Section 5. However, for 
some of the applications presented in this work, the case when $0 < \dim E_0 < n$ is the most
interesting. We also remark that Caffarelli's theorem has recently been generalized in other 
directions by Valdimarsson [52] and Kolesnikov [35]. 



1.2 The Construction 

As opposed to the non-constructive optimal-transport map $T_{opt}$, our map $T$ is obtained as
a limit of diffeomorphisms $\{T_t\}_{t \ge 0}$, constructed as a (reverse) flow along an advection field
generated by an appropriate heat diffusion process. Let $L$ denote the following second-order
differential operator:

$$L = \exp(U) \nabla \cdot (\exp(-U) \nabla) = \Delta - \langle \nabla U, \nabla \rangle , \tag{1.4}$$

and let $P_t^U := \exp(tL) : L_\infty(\mathbb{R}^n) \to L_\infty(\mathbb{R}^n)$ denote the associated diffusion semi-group,
characterized as solving the parabolic equation:

$$\frac{d}{dt} P_t^U(f) = L(P_t^U(f)) , \quad P_0^U(f) = f \quad (\text{for smooth bounded functions } f) . \tag{1.5}$$

The latter is simply the usual heat-equation with an additional first-order drift term, also 
known as the (linear) Fokker-Planck equation. Its invariant measure is easily checked to be 
$\mu = \exp(-U(x))dx$:

$$\int L(f) g \, d\mu = - \int \langle \nabla f, \nabla g \rangle \, d\mu = \int f L(g) \, d\mu , \quad \int P_t^U(f) g \, d\mu = \int f P_t^U(g) \, d\mu . \tag{1.6}$$

In particular, $-L$ becomes a self-adjoint positive semi-definite operator on an appropriate
dense subspace of $L_2(\mu)$. Since $\mu$ is a log-concave probability measure, it is known that
$-L$ has a non-trivial spectral-gap, from which it follows by the Spectral Theorem that
$P_t^U(f) \to \int f \, d\mu$ as $t \to \infty$ in a rather strong sense (see Section 3). Defining:

$$\nu_t := P_t^U(\exp(-V)) \mu , \tag{1.7}$$

it follows in particular that $\nu_0 = \nu$ and $\nu_t \to \mu$ as $t \to \infty$, so $\{\nu_t\}$ naturally interpolate between
$\nu$ and $\mu$. We will show how to construct diffeomorphisms $\{T_t\}_{t \ge 0}$, so that each $T_t$ is a
contraction satisfying our symmetry assumptions which pushes forward $\nu_t$ onto $\nu$. Theorem
1.1 then follows by a compactness argument, ensuring that $\{T_t\}$ converge appropriately to
our desired map $T$.

Our construction is in fact for the inverse-maps $S_t := T_t^{-1}$, pushing forward $\nu$ onto $\nu_t$.
These diffeomorphisms are constructed as a flow along a (time-dependent) advection field
$W_t$ induced by our diffusion:

$$\frac{d}{dt} S_t(x) = W_t(S_t(x)) , \quad S_0 = Id . \tag{1.8}$$

To choose a consistent $W_t$, we use the well-known Continuity Equation (see e.g. [53]):

$$\frac{d}{dt} \nu_t + \nabla \cdot (\nu_t W_t) = 0 ,$$

which allows us to pass from the Lagrangian view point (1.8) to an Eulerian one. We
conclude using (1.7) that:

$$\frac{d}{dt} P_t^U(\exp(-V)) = -\exp(U) \nabla \cdot (\exp(-U) P_t^U(\exp(-V)) W_t) ,$$

and to make this consistent with (1.4) and (1.5), we choose:

$$W_t := -\nabla \log P_t^U(\exp(-V)) . \tag{1.9}$$

It remains to show that the maps $S_t$ are expansions, i.e. $|S_t(x) - S_t(y)| \ge |x - y|$. Being
diffeomorphisms, this is equivalent to requiring that the maps are expansions locally:

$$(DS_t)^* DS_t \ge Id .$$

Differentiating this inequality in $t$ and using (1.8), we see that it suffices to show that
$DW_t + (DW_t)^* \ge 0$ for all $t \ge 0$. By (1.9), this is equivalent to showing that:

$$-D^2 \log P_t^U(\exp(-V)) \ge 0 \quad \forall t \ge 0 .$$
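
As a sanity check of the choice (1.9), one can verify formally that it turns the Continuity Equation into the parabolic equation (1.5), and that the expansion property reduces to a sign condition on the symmetrized derivative of $W_t$. The computation below is only a formal sketch: the shorthand $f_t := P_t^U(\exp(-V))$ is introduced here for convenience, and $f_t$ is assumed smooth and positive.

```latex
% Shorthand (introduced for this sketch): f_t := P_t^U(exp(-V)), so \nu_t = f_t e^{-U} dx.
% Continuity equation for the density f_t e^{-U}, with W_t = -\nabla \log f_t:
\begin{align*}
\partial_t \bigl( f_t e^{-U} \bigr)
  &= -\nabla \cdot \bigl( f_t e^{-U} W_t \bigr)
   = \nabla \cdot \bigl( e^{-U} \nabla f_t \bigr) \\
  &= e^{-U} \bigl( \Delta f_t - \langle \nabla U, \nabla f_t \rangle \bigr)
   = e^{-U} L f_t ,
\end{align*}
% which is exactly \partial_t f_t = L f_t, i.e. (1.5).
%
% Expansion: by (1.8), \partial_t (DS_t) = DW_t(S_t)\, DS_t, hence
\begin{align*}
\frac{d}{dt}\, (DS_t)^* DS_t
  &= (DS_t)^* \bigl( DW_t(S_t) + DW_t(S_t)^* \bigr) DS_t ,
  \qquad (DS_0)^* DS_0 = \mathrm{Id} .
\end{align*}
% If DW_t + (DW_t)^* \ge 0, the right-hand side is positive semi-definite,
% so (DS_t)^* DS_t \ge Id for all t \ge 0. Since DW_t = -D^2 \log f_t is symmetric,
% the condition reads -D^2 \log f_t \ge 0.
```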

1.3 The Reduction 

This is formulated in the following result, which we believe is of independent interest: 

Theorem 1.2. Under the assumptions of Theorem 1.1, $P_t^U$ preserves the log-concavity of
$\exp(-V)$. In other words, $-\log P_t^U(\exp(-V))$ is a convex function for all $t \ge 0$.

It should be noted that by a result of A. Kolesnikov [34] (see also [44], and compare with
[28]), the only smooth linear diffusion processes (1.5) with generator $L = A(x) \nabla^2 + b(x) \nabla$
which preserve the log-concavity of $\exp(-V)$ for arbitrary convex functions $V$, are precisely
the Ornstein-Uhlenbeck processes, given by a constant valued matrix $A$ and an affine map
$b$ (for our generator (1.4), this corresponds to quadratic potentials $U = Q$). That the
Ornstein-Uhlenbeck processes preserve log-concavity is well known, and may be easily seen
using the Mehler formula and the Prékopa-Leindler Theorem (e.g. [25]); together with our
construction above, this already provides an alternative proof of Caffarelli's Contraction
Theorem (with some other map $T$). By restricting to convex functions $V$ having certain
symmetries, as in Theorem 1.2, we are able to show that log-concavity is preserved for
generators with more general potentials $U$.

The proof of Theorem 1.2 is based on parabolic PDE methods and in particular the 
maximum principle (see [31, 41, 24] and the references therein). Let us give a very heuristic 
outline of the proof. After assuming that V is smooth enough and strictly convex, and 
restricting the problem onto a smooth, bounded and strictly convex domain by imposing zero 
Dirichlet boundary conditions, we proceed in the contrapositive. Assume that $V = V(x, t)$
does not remain strictly convex, and argue that there will be a first time $t_0 > 0$ when this
fails; this step is the most delicate in all of the proof and requires very careful justification,
a point that has been omitted in many previous works on concavity properties of solutions
to parabolic PDE. The strict convexity of the boundary guarantees that the minimum of
$D^2_{e,e} V(x, t_0)$ will be attained in an interior point $x_0$ and some direction $e$. Since this will
be a local minimum, this implies on one hand that $(d/dt - \Delta)(D^2_{e,e} V)(x_0, t_0) \le 0$. On the
other hand, using that $D D_e V = 0$ and $D D^2_{e,e} V = 0$ at $(x_0, t_0)$, a calculation shows that:

$$\left( (d/dt - \Delta)(D^2_{e,e} V) \right)|_{(x_0, t_0)} = -(D^3 U)|_{x_0}(e, e, \nabla V(x_0, t_0)) .$$

At time t = to, V(-,t) is still assumed to be convex, and our geometric structural and 
symmetry assumptions on U and V were precisely designed to guarantee that the latter 
expression be non-negative. Massaging this argument a little more, we obtain a contradic- 
tion, thereby concluding the proof. We emphasize again that key to our approach is an 
analysis at the very first time $t_0 > 0$ when things may go wrong - a triviality for the usual
application of the maximum principle for a (uniformly continuous) function on a bounded 
parabolic domain, but a genuine issue when applied to its second derivatives, which may 
not be uniformly continuous up to the boundary. 
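
The calculation alluded to in this outline can be sketched as follows. This is a heuristic check only, under the simplifying assumptions of the outline (no perturbation term, all derivatives assumed to exist and be continuous); it starts from the non-linear equation satisfied by $V = -\log f$, which is stated in Subsection 2.3.

```latex
% V solves  \partial_t V = \Delta V - <\nabla V, \nabla U> - |\nabla V|^2 .
% Apply D^2_{e,e} to both sides (it commutes with \partial_t - \Delta):
\begin{align*}
(\partial_t - \Delta)\, D^2_{e,e} V
  &= D^2_{e,e} \bigl( -\langle \nabla V, \nabla U \rangle - |\nabla V|^2 \bigr) \\
  &= -\langle D D^2_{e,e} V, \nabla U \rangle
     - 2 \langle D D_e V, D D_e U \rangle
     - \langle \nabla V, D D^2_{e,e} U \rangle \\
  &\quad - 2 \langle D D^2_{e,e} V, \nabla V \rangle - 2\, |D D_e V|^2 .
\end{align*}
% At (x_0, t_0): D D_e V = 0 and D D^2_{e,e} V = 0, so a single term survives:
\[
(\partial_t - \Delta)\, D^2_{e,e} V \,\big|_{(x_0,t_0)}
  = -\langle \nabla V, D D^2_{e,e} U \rangle
  = -(D^3 U)|_{x_0} \bigl( e, e, \nabla V(x_0,t_0) \bigr) .
\]
```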

1.4 Applications 

Besides the applications provided in his original paper [14], Caffarelli's Contraction Theorem 
has found numerous applications in various fields, serving as a tool to transfer isoperimetric 
inequalities, obtaining correlation inequalities, and more (see e.g. [17, 18, 26, 33]). Most 
of these applications only use the fact that there exists some contracting map pushing 
forward one measure onto another, without employing the additional information that this 
map is the Brenier map, i.e. the gradient of a convex function. Consequently, it is a 
mere exercise to repeat the corresponding proofs in our more general setting, replacing 
Caffarelli's Theorem with Theorem 1.1, and thereby extending these applications. We will 
not go through all of these in this work, but rather indicate several selected applications 
pertaining to correlation inequalities, extending in particular some known results regarding 
the Gaussian Correlation Conjecture (described in Section 4) to our setup, following an 
argument of Dario Cordero-Erausquin [17]. We will also briefly indicate how to obtain new 
isoperimetric inequalities. 

1.5 Afterthoughts 

After understanding how to extend Caffarelli's Contraction Theorem using our heat-induced 
flow and proving Theorem 1.1, we revisited Caffarelli's original argument from [14], and 
observed: 

Theorem 1.3. Theorem 1.1 is also valid when replacing $T$ with the Brenier optimal-transport
map $T_{opt}$ pushing forward $\mu$ onto $\nu$.

For the proof of Theorem 1.3, which is based on Caffarelli's own proof, we require an
additional ingredient from [14] in the form of Theorem 5.1, described in Section 5. Roughly
speaking, Caffarelli's argument is oblivious to the quadratic part of $U$, and for the non-quadratic
part on $E_0^\perp$, reduces under our assumptions the task of showing that $T_{opt}$ is a
contraction, to showing that it is a contraction with respect to the origin. It is this latter
property which is verified using Theorem 5.1.

In Section 6, we compare between the two maps $T$ (as constructed in Subsection 1.2) and
$T_{opt}$. It is not hard to verify that the path $[0, \infty) \ni t \mapsto S_t$ of our interpolating diffeomorphisms
does not coincide in general with the path $[0, 1) \ni s \mapsto (1 - s) Id + s S_{opt}$ of optimal
interpolating maps, where $S_{opt} = T_{opt}^{-1}$ denotes the Brenier map pushing forward $\nu$ onto $\mu$.
Indeed, our diffusion process may be seen as the gradient flow for the entropy functional
$H(\nu_t | \mu)$ on the Wasserstein space $W_2(\mathbb{R}^n)$ equipped with an appropriate Riemannian structure
(Otto and Villani [49], see also Jordan-Kinderlehrer-Otto [29]); optimal-transport, on
the other hand, corresponds to moving along the geodesic between $\nu$ and $\mu$ in $W_2(\mathbb{R}^n)$, i.e.
gradient flow for the distance squared functional $W_2^2(\nu_t, \mu)$. Consequently, we believe that
the limiting maps $T$ and $T_{opt}$ are in general different, although we have not been able to
exclude the possibility that they coincide. The assumptions of Theorem 1.1 were precisely
designed to ensure that $T$ contracts distances, but it is quite surprising that exactly the
same assumptions imply (for seemingly different reasons!) the same for $T_{opt}$.

When comparing these two approaches, it is worth pointing out that our diffusion approach
only relies on classical regularity results for linear parabolic PDEs, whereas analyzing
the optimal-transport map requires Caffarelli's deeper regularity results for the fully-nonlinear
Monge-Ampère equation (see [12, 13] and the references therein); consequently,
the former approach may lend itself to further generalization, in particular to setups where
the latter regularity results for the Brenier-McCann optimal-transport map are unavailable,
or alternatively, known to be false, as in the Riemannian-manifold setting (see [54]).

1.6 Organization 

The rest of this work is organized as follows. In Section 2 we provide a complete proof of 
Theorem 1.2. In Section 3, we rigorously justify the proof of Theorem 1.1 described above, 
providing the (few) missing details in the above construction. In Section 4 we present some 
applications of Theorem 1.1. In Section 5, we revisit Caffarelli's argument and provide an 
alternative proof of Theorem 1.1 for the Brenier map $T_{opt}$ itself. Lastly, in Section 6, we
compare between the two maps $T$ and $T_{opt}$, and conclude with some final remarks.

2 Proof of Theorem 1.2

This section is dedicated to the proof of Theorem 1.2, from which Theorem 1.1 easily
follows, as explained in the Introduction, and rigorously verified in Section 3. We begin by
setting up the notation throughout the paper. Our basic reference is [37], even though our
notation varies slightly from the notation used there. We will use $D$ and $\nabla$ interchangeably
to denote the derivative operator in $\mathbb{R}^n$. Given a non-negative integer $k$, we denote by
$C^k(\Sigma)$ the space of real-valued functions on $\Sigma \subset \mathbb{R}^n$ with continuous derivatives $D^a f$, for
every multi-index $a$ of order $|a| \le k$, equipped with the usual maximum norm:

$$\|f\|_{C^k(\Sigma)} := \sum_{|a| \le k} \sup_{x \in \Sigma} |D^a f(x)| .$$

Similarly, the space $C^{k+\alpha}(\Sigma) = C^{k,\alpha}(\Sigma)$ denotes the subspace of functions whose $k$-th order
derivatives are Hölder continuous of order $\alpha \in (0, 1]$, equipped with the norm:

$$\|f\|_{C^{k,\alpha}(\Sigma)} := \|f\|_{C^k(\Sigma)} + \sup_{x \ne y \in \Sigma, \ |a| = k} \frac{|D^a f(x) - D^a f(y)|}{|x - y|^\alpha} .$$

We will say that a continuous function is Hölder continuous of order 0, in which case $C^k(\Sigma)$
indeed coincides with $C^{k,0}(\Sigma)$.

When $\Sigma = \Omega \times \Theta$ is a product domain consisting of space $x \in \Omega$ and time $t \in \Theta$
components, we will denote by $C^{k \times l}(\Omega \times \Theta)$ the space of real-valued functions $f$ with
continuous (in $\Sigma$) space derivatives $D_x^a$ of order $|a| \le k$ and time derivatives $D_t^s$ of order
$s \le l$, equipped with the norm:

$$\|f\|_{C^{k \times l}(\Omega \times \Theta)} := \sum_{|a| \le k} \sup_{z \in \Sigma} |D_x^a f(z)| + \sum_{s=0}^{l} \sup_{z \in \Sigma} |D_t^s f(z)| .$$

We will also denote by $C^{(\beta, \beta/2)}(\Omega \times \Theta)$ the space of real-valued functions $f$ on $\Sigma$ such that
for every integer $r, s \ge 0$ with $r + 2s \le \beta$ and $|a| = r$, $D_x^a D_t^s f$ is Hölder continuous in $x$ of
order $\min(\beta - (r + 2s), 1)$ and in $t$ of order $\min(\beta/2 - (r/2 + s), 1)$. The natural norm on
this space is given by:



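$$\begin{aligned}
\|f\|_{C^{(\beta, \beta/2)}(\Omega \times \Theta)} := {} & \sum_{r + 2s \le \lfloor \beta \rfloor} \sum_{|a| = r} \sup_{z \in \Sigma} |D_x^a D_t^s f(z)| \\
& + \sum_{r + 2s = \lfloor \beta \rfloor} \sum_{|a| = r} \sup_{x_1 \ne x_2 \in \Omega, \ t \in \Theta} \frac{|D_x^a D_t^s f(x_1, t) - D_x^a D_t^s f(x_2, t)|}{|x_1 - x_2|^{\beta - (r + 2s)}} \\
& + \sum_{\lfloor \beta \rfloor - 1 \le r + 2s \le \lfloor \beta \rfloor} \sum_{|a| = r} \sup_{x \in \Omega, \ t_1 \ne t_2 \in \Theta} \frac{|D_x^a D_t^s f(x, t_1) - D_x^a D_t^s f(x, t_2)|}{|t_1 - t_2|^{(\beta - (r + 2s))/2}} .
\end{aligned}$$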



Lastly, we will denote by $W_p^{2l,l}(\Omega \times \Theta)$ for $p \in [1, \infty]$ and $l$ a non-negative integer, the
space of functions $f$ on $\Omega \times \Theta$ so that for any integer $r, s \ge 0$ with $r + 2s \le 2l$ and $|a| = r$,
the distributional derivatives $D_x^a D_t^s f$ are in $L_p(\Omega \times \Theta)$ (this space is equipped with its usual
Sobolev norm, which we will not require explicitly).

Finally, we let $F_{loc}(\Sigma)$ denote the space of functions belonging to $F(\Pi)$ for all compact
subsets $\Pi$ of $\Sigma$.

2.1 Reduction to smooth V 

Let us start by summarizing several well-known properties of the semi-group $\{P_t^U\}_{t \ge 0}$.

From the classical theory of parabolic equations, it follows that for each $t \ge 0$, $P_t^U$ acts
linearly on the space $B(\mathbb{R}^n)$ of smooth bounded functions on $\mathbb{R}^n$ to itself (indeed, there
exists a unique solution of (1.5) in the class of bounded functions), and hence is a semi-group
$P_t^U \circ P_s^U = P_{t+s}^U$. Moreover, by the maximum principle, it follows that $\|P_t^U(f)\|_{L_\infty} \le \|f\|_{L_\infty}$,
and that $P_t^U(f) \ge 0$ for any $f \ge 0$ in $B(\mathbb{R}^n)$. Since $\int P_t^U(f) \, d\mu = \int f \, d\mu$, as easily checked by
differentiating in $t$ and using (1.5), it follows by interpolation that $\|P_t^U(f)\|_{L_p(\mu)} \le \|f\|_{L_p(\mu)}$
for all $p \in [1, \infty]$. Consequently, the action of $P_t^U$ extends to all of the $L_p(\mu)$ spaces,
clarifying the statement of Theorem 1.2.

It follows immediately that it is enough to prove Theorem 1.2 for smooth functions $V$.
Indeed, any convex function $V : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ may be pointwise approximated from
below by a non-decreasing sequence of smooth convex functions $V_m : \mathbb{R}^n \to \mathbb{R}$, which may
be chosen to preserve any symmetry properties satisfied by $V$. In particular, $\exp(-V_m)$
tends to $\exp(-V)$ in $L_1(\mu)$, and so $V_m + c_m$ satisfy the assumptions of Theorem 1.2, where
$c_m \to 0$ denote normalization constants ensuring that $\exp(-(V_m + c_m)) \mu$ are probability
measures. Consequently $P_t^U(\exp(-V_m))$ tends to $P_t^U(\exp(-V))$ in $L_1(\mu)$, and since the
sequence $P_t^U(\exp(-V_m))$ is pointwise non-increasing (using the positivity of $P_t^U$), it follows
that there exists a pointwise limit which coincides with $P_t^U(\exp(-V))$ in $L_1(\mu)$. By assuming
that Theorem 1.2 holds for smooth functions, it follows that $P_t^U(\exp(-V_m))$ are log-concave:

$$P_t^U(\exp(-V_m))\left(\frac{x + y}{2}\right) \ge P_t^U(\exp(-V_m))(x)^{\frac{1}{2}} \, P_t^U(\exp(-V_m))(y)^{\frac{1}{2}} \quad \forall x, y \in \mathbb{R}^n ,$$

and this is clearly preserved under pointwise limit. The reduction to the case that $V$ is
smooth is complete.

2.2 Reduction to vanishing Dirichlet boundary conditions 

Let $B(R)$ denote the open Euclidean ball in $\mathbb{R}^n$ of radius $R$ centered at the origin, and let
$\chi : [0, 1] \to [0, 1]$ denote a smooth log-concave (non-increasing) function so that $\chi|_{[0,1)} > 0$,
$\chi|_{[0,1/2]} \equiv 1$ and $\chi(1) = 0$.

Proposition 2.1. Let $U \in C^{1,\alpha}_{loc}(\mathbb{R}^n)$, $V \in C^{2,\alpha}_{loc}(\mathbb{R}^n)$ and $\exp(-V) \in C^0(\mathbb{R}^n)$. Assume that
for any $R, T > 0$, the solution $f_R(x, t)$ to the parabolic equation:

$$\frac{d}{dt} f_R = \Delta f_R - \langle \nabla f_R, \nabla U \rangle , \quad f_R(x, 0) = \exp(-V(x)) \chi(|x|/R) , \quad (x, t) \in B(R) \times [0, T] ,$$

with vanishing Dirichlet boundary conditions:

$$f_R|_{\partial B(R) \times [0, T]} = 0 ,$$
is spatially log-concave on $B(R)$ for any $t \in [0, T]$. Then the (unique) bounded solution
$f(x, t)$ to the Cauchy problem:

$$\frac{d}{dt} f = \Delta f - \langle \nabla f, \nabla U \rangle , \quad f(x, 0) = \exp(-V(x)) , \quad (x, t) \in \mathbb{R}^n \times [0, \infty) \tag{2.1}$$

is also spatially log-concave on $\mathbb{R}^n$ for any $t \ge 0$.

Proof. This follows from a standard argument, which we include for completeness. Fix
$T > 0$; we will show that $f(x, t)$ is log-concave on $\mathbb{R}^n$ for any $t \in [0, T]$. By the classical
theory of parabolic PDEs (e.g. [37, Chapter IV, Theorem 10.1]), for any $0 < r < r' < R$,
we have the following (spatial) interior Schauder-type estimate:

$$\|f_R\|_{C^{(2+\alpha, 1+\alpha/2)}(B(r) \times [0, T])} \le C_1 \|f_R(\cdot, 0)\|_{C^{2+\alpha}(B(r'))} + C_2 \|f_R\|_{C^0(B(r') \times [0, T])} ,$$

where the constants $C_1, C_2 > 0$ above depend only on $n$, $T$, $\|\nabla U\|_{C^{0,\alpha}(B(r'))}$, $r$, $r'$, $\alpha$. By the
maximum principle, $\|f_R\|_{C^0(B(r') \times [0, T])} \le \|\exp(-V)\|_{C^0(\mathbb{R}^n)} < \infty$. And if we assume that
$R > 1$, since $\chi$ is smooth it follows that $\|f_R(\cdot, 0)\|_{C^{2,\alpha}(B(r'))} \le C_3 \|\exp(-V)\|_{C^{2,\alpha}(B(r'))} < \infty$
for some constant $C_3 > 0$. We conclude that:

$$\forall r > 0 , \ \exists C_r > 0 \ \text{ such that } \ \forall R \ge r + 1 , \quad \|f_R\|_{C^{(2+\alpha, 1+\alpha/2)}(B(r) \times [0, T])} \le C_r .$$

It follows by Arzelà-Ascoli compactness that given $r > 0$, we may extract a sequence of
$R_m \ge r + 1$ increasing to infinity, so that $f_{R_m}$ converges in $C^{2 \times 1}(B(r) \times [0, T])$. Applying
a standard diagonalization argument, we conclude that there exists a sequence
$\{R_k\}$ increasing to infinity, so that $f_{R_k}$ converges in $C^{2 \times 1}_{loc}(\mathbb{R}^n \times [0, T])$ to some
$f_\infty \in C^{(2+\alpha, 1+\alpha/2)}_{loc}(\mathbb{R}^n \times [0, T])$ (which is in addition clearly bounded). It follows that $f_\infty$ satisfies
(2.1) on $\mathbb{R}^n \times [0, T]$, so by the well-known uniqueness of this equation in the class of bounded
functions, we deduce that $f_\infty = f$ on $\mathbb{R}^n \times [0, T]$. But $f_\infty(\cdot, t)$ is clearly log-concave for any
$t \in [0, T]$, just from being the pointwise limit of the log-concave functions $f_{R_k}(\cdot, t)$. This
concludes the proof. □

Let $V \in C^{4,\alpha}_{loc}(\mathbb{R}^n)$ satisfy the assumptions of Theorem 1.2. If we define $V_R \in C^{4,\alpha}(B(R))$
by setting $\exp(-V_R) = \exp(-V(x)) \chi(|x|/R)$ on $B(R)$, we note that the symmetry assumptions
of Theorem 1.2 remain intact for $V_R$ on $B(R)$. By Subsection 2.1 and Proposition
2.1, Theorem 1.2 consequently reduces to the following:

Theorem 2.2. Let $U$ be as in Theorem 1.1 and let $f_0 \in C^{4,\alpha}(B(R))$ denote a positive
function on $B(R)$ vanishing on $\partial B(R)$. Assume that on $B(R)$, $f_0 = \exp(-V_0)$, with $V_0$
convex and satisfying our symmetry assumptions (1.2). Then for every $T > 0$, the unique
solution $f$ to the following parabolic equation on $B(R) \times [0, T]$:

$$\frac{d}{dt} f = \Delta f - \langle \nabla f, \nabla U \rangle , \quad f|_{t=0} = f_0 , \quad f|_{\partial B(R) \times [0, T]} = 0 \tag{2.2}$$

is spatially log-concave, i.e. $f = \exp(-V)$ with $V(\cdot, t)$ convex on $B(R)$ for every $t \in [0, T]$.

This reduction step is similar to the one in [19], referenced to us by Cédric Villani, whom
we would like to thank.

2.3 Log-Concavity away from the boundary

We proceed to provide a proof of Theorem 2.2, modulo some very delicate details which
are postponed to the next subsection. As in many previous works on concavity/convexity
properties of solutions to elliptic and parabolic PDEs ([42, 36, 32, 15, 31, 20, 41]), our 
approach is based on the maximum principle for the second derivative (or its finite difference 
analogue); other approaches may be found e.g. in [9, 7, 23, 1, 16, 6] and the references 
therein, or in the classical book by B. Kawohl [31]. We clarify some of the difficulties which 
arise in showing log-concavity in the parabolic case and which were omitted in some of 
these previous works. Another challenge we encounter, is that the condition our parabolic 
equation must satisfy, so that we can deduce the log-concavity of the solution, in fact 
assumes that the solution is already log-concave. Hence, arguing in the contrapositive, we 
must perform our analysis at precisely the first time when things go wrong, which again 
requires some delicate justification. To this end, we avoid using the usual convexity function, 
introduced by Korevaar [36] and employed by many others (see the previously mentioned 
references or [31, 41, 24] and the references therein), and work directly with the second 
derivatives. 

Proof of Theorem 2.2. By approximating $f_0$ appropriately and arguing as in Subsection 2.1,
we may assume that:

$$\min_{x \in \partial B(R)} |\nabla f_0|(x) > 0 ; \tag{2.3}$$

the only difference is that now, due to the boundary conditions, $\|f(\cdot, t)\|_{L_1(\mu|_{B(R)})}$ will not
be preserved, but rather decrease, with time. See also [24, Lemma 6.1], where a similar
preliminary step was employed.

Fix $T > 0$. Since $f_0 \in C^{4,\alpha}(B(R))$ and in addition every component of $\nabla U$ is in
$C^{2,\alpha}(B(R))$, it follows from the classical Schauder theory of parabolic PDEs (e.g. [37,
Chapter IV, Theorem 10.1]) that $f \in C^{(4+\alpha, 2+\alpha/2)}_{loc}(B(R) \times [0, T])$ (i.e. $f \in C^{(4+\alpha, 2+\alpha/2)}(K \times [0, T])$
for every compact subset $K \subset B(R)$), and also that $f \in C^{(4+\alpha, 2+\alpha/2)}(\overline{B(R)} \times [\epsilon, T])$,
for any $0 < \epsilon < T$. A crucial point to note is that the latter smoothness of the solution
does not extend all the way to the entire boundary $\partial B(R) \times [0, T]$, since our assumption
(2.3) contradicts (in general) the compatibility which is usually assumed between the spatial
derivatives of $f_0$ and the time derivatives of our Dirichlet conditions (see Subsection 2.4).
This difficulty seems unavoidable using this approach, and addressing it requires careful
justification of subsequent steps, something which has been omitted in previous works.

It also follows from the strong maximum principle (and our initial conditions) that $f > 0$
on $B(R) \times [0, T]$, and hence $V \in C^{(4+\alpha, 2+\alpha/2)}_{loc}(B(R) \times [0, T])$. One immediately checks that
$V$ satisfies the following non-linear parabolic PDE on $B(R) \times [0, T]$:

$$\frac{d}{dt} V = \Delta V - \langle \nabla V, \nabla U \rangle - \langle \nabla V, \nabla V \rangle .$$
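
The "immediate check" of the equation for $V$ amounts to substituting $f = \exp(-V)$ into (2.2); a short sketch of that substitution, assuming only that $f$ is positive and twice differentiable in space, is:

```latex
% With f = e^{-V}:
\begin{align*}
\partial_t f = -e^{-V}\, \partial_t V , \qquad
\nabla f = -e^{-V}\, \nabla V , \qquad
\Delta f = e^{-V} \bigl( |\nabla V|^2 - \Delta V \bigr) .
\end{align*}
% Substitute into \partial_t f = \Delta f - <\nabla f, \nabla U> and divide by -e^{-V}:
\[
\partial_t V = \Delta V - \langle \nabla V, \nabla U \rangle - \langle \nabla V, \nabla V \rangle .
\]
```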

Let $\epsilon > 0$ and define $\tilde{V} \in C^{(4+\alpha, 2+\alpha/2)}_{loc}(B(R) \times [0, T])$ as:

$$\tilde{V}(x, t) := V(x, t) + \epsilon \beta(t) \frac{|x|^2}{2} ,$$

where $\beta : [0, T] \to \mathbb{R}_+$ denotes a suitable strictly positive smooth function to be determined
later on. We claim that for all small enough $\epsilon > 0$, $\tilde{V}(\cdot, t)$ must remain strictly convex for
all $t \in [0, T]$, and taking the limit as $\epsilon \to 0$, we will conclude that $V(\cdot, t)$ is itself convex, as
required.

Assume in the contrapositive that this is not so. Let $t_0 \in [0, T]$ denote the infimum
over all times $t$ when $\tilde{V}(\cdot, t)$ is not strictly convex, so that there exists a sequence
$(x_m, t_m, e_m) \in B(R) \times (0, T] \times S^{n-1}$ converging to $(x_0, t_0, e) \in \overline{B(R)} \times [0, T] \times S^{n-1}$ and
satisfying $D^2_{e_m, e_m} \tilde{V}(x_m, t_m) \le 0$ (here $S^{n-1}$ denotes the unit sphere in $\mathbb{R}^n$, identified with
the unit sphere in the tangent spaces $T_{x_m} \mathbb{R}^n$).

The most delicate part of the proof will be presented in Proposition 2.6 in the next
subsection, where it will be shown that some further regularity estimates of $f$ up to the
boundary, together with (2.3) and the strict convexity of $\partial B(R)$, imply that necessarily
$x_0 \notin \partial B(R)$. It follows by continuity of the second derivative of $\tilde{V}$ in $B(R) \times [0, T]$ and the
minimality of $t_0$ that $D^2_{e,e} \tilde{V}(x_0, t_0) = 0$, and therefore $t_0 > 0$ (since at time $t = 0$, $\tilde{V}(\cdot, t)$ is
clearly strictly convex). Moreover, $x_0 \in B(R)$ is a local minimum point, and hence:

$$D D^2_{e,e} \tilde{V}(x_0, t_0) = 0 , \quad \Delta D^2_{e,e} \tilde{V}(x_0, t_0) \ge 0 , \quad \frac{d}{dt} D^2_{e,e} \tilde{V}(x_0, t_0) \le 0 , \tag{2.4}$$

where $D$ denotes the space derivative. Since 0 is the minimum value for the function
$e \mapsto D^2_{e,e} \tilde{V}(x_0, t_0)$, it follows that it must be an eigenvalue of $D^2 \tilde{V}(x_0, t_0)$, and that $e$ is a
corresponding eigenvector:

$$D D_e \tilde{V}(x_0, t_0) = D^2 \tilde{V}(x_0, t_0) e = 0 , \ \text{ and hence } \ D D_e V(x_0, t_0) = -\epsilon \beta(t_0) e . \tag{2.5}$$

Using (2.4), we must have at $(x_0, t_0)$:

$$(d/dt - \Delta)(D^2_{e,e} \tilde{V}) \le 0 . \tag{2.6}$$
We will show that under our assumptions on $U$ and the definition of $t_0$, the latter value must
be strictly positive, obtaining the desired contradiction and concluding the proof. Indeed,
at a general point $(x, t)$ (recall that $\tilde{V} = V + \epsilon \beta(t) |x|^2 / 2$):

$$\begin{aligned}
(d/dt - \Delta)(D^2_{e,e} \tilde{V}) &= D^2_{e,e}\left( (d/dt - \Delta)(\tilde{V}) \right) \\
&= D^2_{e,e}\left( \epsilon \beta'(t) |x|^2 / 2 - n \epsilon \beta(t) - \langle \nabla V, \nabla U \rangle - \langle \nabla V, \nabla V \rangle \right) \\
&= \epsilon \beta'(t) - \langle D D^2_{e,e} V, DU \rangle - 2 \langle D D_e V, D D_e U \rangle - \langle DV, D D^2_{e,e} U \rangle \\
&\quad - 2 \langle D D^2_{e,e} V, DV \rangle - 2 \langle D D_e V, D D_e V \rangle .
\end{aligned}$$

At $(x_0, t_0)$, using (2.4) and (2.5), we see that:

$$\begin{aligned}
(d/dt - \Delta)(D^2_{e,e} \tilde{V})(x_0, t_0) &= \epsilon \beta'(t_0) + 2 \epsilon \beta(t_0) D^2_{e,e} U - 2 \epsilon^2 \beta(t_0)^2 - \langle DV, D D^2_{e,e} U \rangle \\
&= \epsilon \beta'(t_0) + 2 \epsilon \beta(t_0) D^2_{e,e} U - 2 \epsilon^2 \beta(t_0)^2 + \epsilon \beta(t_0) \langle x, D D^2_{e,e} U \rangle - \langle D\tilde{V}, D D^2_{e,e} U \rangle \\
&\ge \epsilon \beta'(t_0) - \left( 2 \epsilon \beta(t_0) M_2 + 2 \epsilon^2 \beta(t_0)^2 + \epsilon \beta(t_0) R M_3 \right) - D^3 U(e, e, D\tilde{V}) ,
\end{aligned}$$

where $M_2 := \sup_{x \in B(R), \xi \in S^{n-1}} D^2_{\xi,\xi} U(x)$ and $M_3 := \sup_{x \in B(R), \xi \in S^{n-1}} \left| (D^3 U)|_x \left( \xi, \xi, \frac{x}{|x|} \right) \right|$.

Note that by the definitions of $t_0$ and $x_0$, $D^2_{\xi,\xi} \tilde{V}(x, t_0) \ge D^2_{e,e} \tilde{V}(x_0, t_0) = 0$, so $\tilde{V}(\cdot, t)$
is still convex on $B(R)$ at time $t = t_0$. Also note that since $U$, $f_0$ (and $B(R)$) are all
invariant under the action of $O(E_1, \ldots, E_k)$, and since the Laplace operator commutes with
the entire orthogonal group, it follows easily that $f \circ G$ is also a solution to (2.2) for any
$G \in O(E_1, \ldots, E_k)$. The uniqueness of the solution implies that $f(\cdot, t)$ (and hence $V(\cdot, t)$
and $\tilde{V}(\cdot, t)$) are also invariant under the action of this subgroup, and hence satisfy our
symmetry assumptions for all $t \ge 0$. We will see in Proposition 2.3 below that for any
convex function $F : \mathbb{R}^n \to \mathbb{R}$ satisfying our symmetry assumptions, the condition on $U$
implies that $(D^3 U)|_x(\xi, \xi, DF(x)) \le 0$ for any $x \in \mathbb{R}^n$ and $\xi \in S^{n-1}$. Therefore, in order to
arrive at a contradiction with (2.6), it is enough to show that for small enough $\epsilon > 0$ and
an appropriate choice of $\beta$, we have:

$$\beta'(t_0) - \left( 2 \beta(t_0) M_2 + 2 \epsilon \beta(t_0)^2 + \beta(t_0) R M_3 \right) > 0 .$$

Indeed, this is satisfied on $[0, T]$ by setting $\beta(t) := \exp((2 M_2 + R M_3 + 1) t)$ and letting
$\epsilon < 1/(2 \beta(T))$. This completes the contradiction and concludes the proof, modulo Propositions
2.3 and 2.6 below. □
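
The final choice of $\beta$ and $\epsilon$ can be verified in two lines. The sketch below uses the shorthand $\kappa$ for the exponent, which is introduced here only for readability, and uses that $M_2, M_3 \ge 0$ (for $M_2$ this relies on the convexity of $U$).

```latex
% Shorthand (introduced here): \kappa := 2 M_2 + R M_3 + 1, so \beta(t) = e^{\kappa t}, \beta' = \kappa \beta.
\begin{align*}
\beta'(t) - \bigl( 2 \beta(t) M_2 + 2 \epsilon \beta(t)^2 + \beta(t) R M_3 \bigr)
  &= \beta(t) \bigl( \kappa - 2 M_2 - R M_3 \bigr) - 2 \epsilon \beta(t)^2 \\
  &= \beta(t) \bigl( 1 - 2 \epsilon \beta(t) \bigr) .
\end{align*}
% Since \kappa \ge 1 > 0, \beta is increasing, so \beta(t) \le \beta(T) on [0,T];
% hence \epsilon < 1/(2\beta(T)) gives 1 - 2\epsilon\beta(t) > 0 for every t in [0,T].
```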

We conclude this subsection with the proof of the following proposition, which is the 
only place where we use our structural assumptions on U and V. In fact, the assumption 
that U is convex may be omitted in all instances below (see Section 6 for more on this). 

Proposition 2.3. If $U$ and $V$ satisfy the assumptions of Theorem 1.1 then:

$$(D^3 U)|_x(\xi, \xi, \nabla V(x)) \le 0 \quad \forall x \in \mathbb{R}^n \ \ \forall \xi \in S^{n-1} .$$

The proposition follows immediately from the following two lemmata, which we formulate
separately for later use:

Lemma 2.4. Let $U$ satisfy the assumptions of Theorem 1.1. Then $(D^3 U)|_x(\xi, \xi, \theta) \le 0$,
for any $x \in \mathbb{R}^n$, $\xi \in S^{n-1}$ and $\theta \in S^{n-1}$ such that:

$$\forall i = 1, \ldots, k \quad \exists a_i \ge 0 \ \text{ so that } \ \mathrm{Proj}_{E_i} \theta = a_i \mathrm{Proj}_{E_i} x . \tag{2.7}$$

Lemma 2.5. Let $V$ satisfy the assumptions of Theorem 1.1. Then for any $x \in \mathbb{R}^n$, $\theta =
\nabla V(x)$ satisfies (2.7).

Proof of Lemma 2.4. Let $Q_i : E_i \to \mathbb{R}$ be given by $Q_i(x) = \rho_i(|x|)$, $i = 1, \ldots, k$. Taking the
third derivative of $U$, the quadratic term in (1.3) disappears and we are left with:

$$(D^3 U)|_x(\xi, \xi, \theta) = \sum_{i=1}^k (D^3 Q_i)|_{\mathrm{Proj}_{E_i} x}(\mathrm{Proj}_{E_i} \xi, \mathrm{Proj}_{E_i} \xi, \mathrm{Proj}_{E_i} \theta) .$$

Let us show that each summand is non-positive. Denote:

$$x_i := \mathrm{Proj}_{E_i} x , \quad \xi_i := \mathrm{Proj}_{E_i} \xi , \quad \xi_i' := \mathrm{Proj}_{x_i} \xi_i , \quad \xi_i'' := \mathrm{Proj}_{x_i^\perp} \xi_i , \quad \theta_i := \mathrm{Proj}_{E_i} \theta .$$

If $x_i = 0$ then $\theta_i = 0$ and hence the $i$-th summand is also 0, so we may assume that $x_i \ne 0$.
Using (2.7), an elementary calculation yields:

$$(D^3 Q_i)|_{x_i}(\xi_i, \xi_i, \theta_i) = a_i \left( |x_i| \, \rho_i'''(|x_i|) \, |\xi_i'|^2 + \left( \rho_i''(|x_i|) - \frac{\rho_i'(|x_i|)}{|x_i|} \right) |\xi_i''|^2 \right) .$$

Since $t \mapsto \rho_i(|t|)$ is a $C^3$ function, we see that $\rho_i'(0) = 0$. Since $\rho_i''' \le 0$ on $\mathbb{R}_+$, meaning that
$\rho_i'$ is concave there, we deduce that also $\rho_i''(t) \le (\rho_i'(t) - \rho_i'(0))/t = \rho_i'(t)/t$ for all $t > 0$. This
implies that the term in brackets on the right-hand side above is non-positive, and since
$a_i \ge 0$, the entire expression is non-positive as well, as claimed. □
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Proof of Lemma 2.5. Denote as usual Xj = Proj^x, i = 0,1,..., fc. Let us verify (2.7) 
for each i = 1, . . . , k. It is easy to see that the symmetries of V ensure that Di~V{x) := 
ProjE^V{x) lies in the one-dimensional subspace spanned by X{. Hence if Xi = 0, then 
DiV(x) = and (2.7) is satisfied trivially for that i, so we may assume otherwise. Denoting: 

DiV(x) =: DlV(x)^- , 

it remains to verify that D^V(x) > when xi ^ 0. The symmetries of V together with its 
convexity together imply that the following (convex) slice of V's sub-level set at x: 

A(x) := [z G Efr\ V(x + z)< V(x)} , 

contains the product set Be l (\xi\) x ... x BE k (\xk\), where Be^t) denotes the Euclidean 
ball of radius r in Ei . Geometrically, this means that the latter product set lies entirely on 
one side of the tangent plane to A(x) at Proj E ±x, or more precisely, that: 

(Proj E ±W(x), R{x) - xj> < \/R G 0(E 1 , ...,E k ) . 

Recalling that Proj E ±W(x) = Yli=i D?V(x)xi and choosing Ri G 0(E\, . . . , E^) to be 
the reflection in E^, defined by Ri{x) = x — 2xi, we conclude that: 

DlV{x)\xi\ 2 > V* = l,...,k . 

Since we assumed that X\ / 0, it follows that D\V{x) > 0, as required. □ 
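
The "elementary calculation" in the proof of Lemma 2.4 can be spelled out as follows. This is a sketch for a single radial function $Q(x) = \rho(|x|)$ on a Euclidean space, with the index $i$ dropped; the shorthand $r = |x|$ and $e = x/|x|$ is introduced here and is not notation from the text.

```latex
% Assumed setting: Q(x) = \rho(|x|), x \neq 0, r := |x|, e := x/r,
% and \theta = a x with a \ge 0, as in (2.7).
\begin{align*}
D^2 Q|_x(\xi,\xi) &= \rho''(r)\,\langle e,\xi\rangle^2
   + \frac{\rho'(r)}{r}\,\bigl(|\xi|^2 - \langle e,\xi\rangle^2\bigr) , \\
% Moving x in the direction a x changes r at rate a r and leaves e fixed:
D^3 Q|_x(\xi,\xi,a x) &= a\,r\,\frac{d}{dr}\Bigl[\rho''(r)\,\langle e,\xi\rangle^2
   + \frac{\rho'(r)}{r}\,\bigl(|\xi|^2 - \langle e,\xi\rangle^2\bigr)\Bigr] \\
 &= a\,\Bigl( r\,\rho'''(r)\,\langle e,\xi\rangle^2
   + \Bigl[\rho''(r) - \frac{\rho'(r)}{r}\Bigr]\bigl(|\xi|^2 - \langle e,\xi\rangle^2\bigr)\Bigr) .
\end{align*}
```

Here $\langle e,\xi\rangle^2$ and $|\xi|^2 - \langle e,\xi\rangle^2$ are the squared lengths of the components of $\xi$ along $x$ and orthogonal to $x$, so both terms are non-positive as soon as $\rho''' \le 0$ and $\rho''(r) \le \rho'(r)/r$.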

2.4 Log-Concavity near the boundary 

To complete the proof of Theorem 2.2, we must show that $x_0 \notin \partial B(R)$. Recalling the 
definition of $x_0$, this clearly follows from: 

Proposition 2.6. $D^2 V(x,t) > 0$ in a neighborhood of $\partial B(R) \times [0,T]$. 

The proof of Proposition 2.6 will be given at the end of this section, but first we explain 
the subtle regularity issue one is required to address here. Recall that the classical theory 
guarantees that under the assumptions of Theorem 2.2, $f \in C_{loc}^{(4+\alpha;2+\alpha/2)}(B(R) \times [0,T])$ 
(i.e. $f \in C^{(4+\alpha;2+\alpha/2)}(K \times [0,T])$ for every compact subset $K \subset B(R)$), and also that 
$f \in C^{(4+\alpha;2+\alpha/2)}(\overline{B(R)} \times [\epsilon,T])$, for any $0 < \epsilon < T$. However, the latter smoothness does 
not extend all the way to the "corner" $\partial B(R) \times \{0\}$, since in general we cannot guarantee 
the necessary and sufficient compatibility conditions: 

$L^i f_0|_{\partial B(R)} = 0 \quad i = 1,2$ (2.8) 

(here $L^i$ denotes the iterated application of the operator $L$). This prevents a straightforward 
application of standard arguments for deducing Proposition 2.6, and so we consequently 
need to obtain some delicate regularity estimates up to the boundary for the solution $f$ to 
(2.2), which are given in Proposition 2.7 below. 

To outline the proof and properly motivate Proposition 2.7, observe that: 

$D^2 V = -D^2 \log f = \frac{-f D^2 f + \nabla f \otimes \nabla f}{f^2}$ . 

Using Hopf's maximum principle and continuity of $\nabla f$ (see Proposition 2.7 (1)) we see 
below that $\nabla f$ is bounded uniformly away from zero near $\partial B(R) \times [0,T]$. Therefore, the 
term $\nabla f \otimes \nabla f$ is uniformly positive definite when restricted to the normal direction (relative 
to $\partial B(R)$). In addition, the gradient bound implies that $f$ decays linearly to 0 near the 
boundary, and one can show that $-f D^2 f$ decays uniformly to zero near $\partial B(R) \times [0,T]$ (see 
Proposition 2.7 (2)). It follows that $D^2 V$ restricted to the normal direction is uniformly 
positive definite near $\partial B(R) \times [0,T]$. On the other hand, since $\partial B(R)$ is the zero level 
set of $f$, the uniform convexity of $\partial B(R)$ and the uniform lower bound on $|\nabla f|$ together 
imply that $-D^2 f$ restricted to the tangential directions is uniformly positive definite along 
$\partial B(R) \times [0,T]$. Since the tangential second derivatives of $f$ are uniformly continuous (see 
Proposition 2.7 (3)), it follows that $D^2 V$ restricted to the tangential directions is uniformly 
positive definite in a neighborhood of $\partial B(R) \times [0,T]$. Mixed derivatives are controlled 
similarly. 
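
The identity for $D^2 V$ used in this outline is just the chain rule applied to $V = -\log f$. A short sketch, assuming only that $f > 0$ is twice differentiable in space:

```latex
% Chain rule for V = -\log f with f > 0.
\begin{align*}
\nabla \log f &= \frac{\nabla f}{f} , \\
D^2 \log f &= \frac{D^2 f}{f} - \frac{\nabla f \otimes \nabla f}{f^2} , \\
D^2 V = -D^2 \log f &= \frac{-f\,D^2 f + \nabla f \otimes \nabla f}{f^2} .
\end{align*}
```

The numerator separates the two effects discussed above: $\nabla f \otimes \nabla f$ is non-negative and controls the normal direction, while $-f D^2 f$ carries the tangential information and is damped by the factor $f$, which vanishes on $\partial B(R)$.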

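We now proceed with providing the precise details. We begin with: 

Proposition 2.7. Under the assumptions of Theorem 2.2: 

1. $f \in C^{(1+\beta;(1+\beta)/2)}(\overline{B(R)} \times [0,T])$ for all $\beta \in (0,1)$. 

2. For any $\epsilon > 0$ there exists a $C_\epsilon > 0$ so that for any $\lambda \in (0,R)$: 

$\sup_{t \in [0,T]} \|f(\cdot,t)\|_{C^2(\overline{B(R-\lambda)})} \le \frac{C_\epsilon}{\lambda^\epsilon}$ . (2.9) 

3. If $n \ge 2$, the spatial derivatives of $f$ in the non-radial directions are $C^{1,\delta}(\overline{B(R)})$ 
uniformly in $t \in [0,T]$, for any $\delta \in (0,1)$. In other words, for any $\delta \in (0,1)$ there 
exists a finite constant $C_\delta > 0$, so that for any smooth unit vector field $\xi$ on $B(R)$ 
such that $\langle \xi(x),x \rangle = 0$: 

$\sup_{t \in [0,T]} \|\langle \nabla f(\cdot,t),\xi(\cdot) \rangle\|_{C^{1,\delta}(\overline{B(R)})} \le C_\delta$ ; 

(in fact, we actually have $\|\langle \nabla f,\xi \rangle\|_{C^{(1+\delta;(1+\delta)/2)}(\overline{B(R)} \times [0,T])} \le C_\delta$). 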

Remark 2.8. We were informed by Ki-Ahm Lee and Vladimir Maz'ya that it should 
actually be true that: 

$\sup_{t \in [0,T]} \|f(\cdot,t)\|_{C^{1,1}(\overline{B(R)})} < \infty$ , 

but we will not insist on this here since the easier weaker estimate (2.9) suffices for our 
purposes. 

Proof of Proposition 2.7. 1. The first assertion follows from standard regularity theory. 
Even if the compatibility conditions (2.8) do not necessarily hold, it follows by [45, Theorem 
5.1.11 (ii)] that when $f_0 \in C^{1,\beta}(\overline{B(R)})$ for some $\beta \in (0,1)$ and $f_0|_{\partial B(R)} = 0$, then: 

$f \in C^{(1+\beta;(1+\beta)/2)}(\overline{B(R)} \times [0,T])$ . (2.10) 

Alternatively, one may employ the Sobolev regularity theory for parabolic PDEs (e.g. [37, 
Chapter IV, Theorem 9.1 and subsequent Corollary]), which ensures that $f \in W_p^{(2,1)}(B(R) \times 
[0,T])$ for all $p \in (1,\infty)$. Consequently, (2.10) follows by a variant of Morrey's embedding 
theorem (e.g. [37, Chapter II, Lemma 3.3]). 

2. This may be deduced from [43, Theorem 5.15] by considering weighted Hölder spaces. 
To avoid these, one may proceed as follows. Applying a standard Schauder-type interior 
estimate, if $f_0 \in C^{2,\gamma}(\overline{B(R)})$ and each component of $\nabla U$ is in $C^{0,\gamma}(B(R))$, one checks (see 
e.g. [37, p. 355]) that: 

$\|f\|_{C^{(2+\gamma;1+\gamma/2)}(B(R-\lambda) \times [0,T])} \le \frac{C_\gamma}{\lambda^{2+\gamma}} \quad \forall \lambda \in (0,R)$ . (2.11) 

Combining (2.10) and (2.11), we deduce under the assumptions of Theorem 2.2, that for all 
$\lambda \in (0,R)$: 

$\sup_{t \in [0,T]} \|f(\cdot,t)\|_{C^{1,\beta}(\overline{B(R-\lambda)})} \le B_\beta \quad \forall \beta \in (0,1)$ ; 

$\sup_{t \in [0,T]} \|f(\cdot,t)\|_{C^{2,\gamma}(\overline{B(R-\lambda)})} \le \frac{C_\gamma}{\lambda^{2+\gamma}} \quad \forall \gamma \in (0,1)$ . 

Since $\partial B(R-\lambda)$ is uniformly smooth for all $\lambda \in (0,R/2)$, one can use interpolation in 
the spaces of Hölder differentiable functions (see Lunardi [45, Corollary 1.2.19, 1.2.7]), and 
obtain for any $\eta \in (0,\gamma)$ and $\lambda$ in this range: 

$\sup_{t \in [0,T]} \|f(\cdot,t)\|_{C^{2,\eta}(\overline{B(R-\lambda)})} \le A_{2+\gamma,2+\eta,1+\beta}\, B_\beta^{\frac{\gamma-\eta}{1-\beta+\gamma}}\, C_\gamma^{\frac{1-\beta+\eta}{1-\beta+\gamma}}\, \lambda^{-\frac{(2+\gamma)(1-\beta+\eta)}{1-\beta+\gamma}}$ . 

By modifying the constants above, the bound remains valid for all $\lambda \in (0,R)$. Choosing 
$\eta > 0$ and $1 - \beta > 0$ very small, the second part of Proposition 2.7 follows. 
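
The exponents in the interpolation step can be tracked as follows. This is a bookkeeping sketch which assumes the standard interpolation inequality between Hölder spaces, $\|f\|_{C^{s}} \le A \|f\|_{C^{s_0}}^{1-\vartheta} \|f\|_{C^{s_1}}^{\vartheta}$ for $s = (1-\vartheta)s_0 + \vartheta s_1$; the symbol $\vartheta$ is introduced here and is not notation from the text.

```latex
% Interpolating between C^{1,\beta} (order 1+\beta) and C^{2,\gamma} (order 2+\gamma)
% to reach C^{2,\eta} (order 2+\eta), with 0 < \eta < \gamma.
\begin{align*}
2+\eta = (1-\vartheta)(1+\beta) + \vartheta(2+\gamma)
  \quad &\Longrightarrow \quad
  \vartheta = \frac{1-\beta+\eta}{1-\beta+\gamma} , \qquad
  1-\vartheta = \frac{\gamma-\eta}{1-\beta+\gamma} , \\
\|f(\cdot,t)\|_{C^{2,\eta}}
  &\le A\, B_\beta^{\,1-\vartheta} \Bigl(\frac{C_\gamma}{\lambda^{2+\gamma}}\Bigr)^{\vartheta}
  = A\, B_\beta^{\,1-\vartheta}\, C_\gamma^{\,\vartheta}\, \lambda^{-(2+\gamma)\vartheta} .
\end{align*}
% As \eta \to 0 and \beta \to 1 (with \gamma fixed), \vartheta \to 0,
% so the power (2+\gamma)\vartheta of 1/\lambda can be made smaller than any given \epsilon > 0.
```

Since the $C^2$ norm is dominated by the $C^{2,\eta}$ norm, this yields a bound of the form (2.9).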

3. This part is obtained by first flattening the boundary $\partial B(R)$ near a point, and then 
applying the standard parabolic regularity theory to the resulting PDE for $D_\tau f$, where $\tau$ 
denotes a vector parallel to the flattened boundary. This procedure is standard, and the 
details are provided for the reader's convenience. 

Let us fix an orthogonal basis $e_1,\ldots,e_n$ of $(\mathbb{R}^n,|\cdot|)$ and a direction $\xi_0 \in S^{n-1}$. Let 
$T : B(R) \to \Omega$ denote a smooth diffeomorphism so that $T$ coincides with the usual 
Cartesian-to-polar change of coordinates on the half-annulus $A_+ := (B(R) \setminus B(R/2)) \cap 
\{x \in \mathbb{R}^n ; \langle x,\xi_0 \rangle > 0\}$. Now consider the PDE satisfied by $g = f \circ T^{-1}$ on $\Omega$. Since both 
$T$ and $T^{-1}$ are smooth and in particular Lipschitz, it is easy to check that $g$ satisfies a 
uniformly parabolic PDE on $\Omega \times [0,T]$ of the form: 

$\frac{d}{dt} g = \sum_{i,j} a_{i,j} D_{i,j} g + \sum_i b_i D_i g$ , (2.12) 

where $a_{i,j} = a_{i,j}(y)$ is a uniformly positive-definite smooth matrix and $b_i = b_i(y)$ have the 
same smoothness as $\nabla U$, i.e. $\in C^{2,\alpha}(\Omega)$. Moreover, since in polar coordinates: 

$\Delta = r^{-n+1} \frac{\partial}{\partial r}\left( r^{n-1} \frac{\partial}{\partial r} \right) + \frac{1}{r^2} \Delta_{S^{n-1}}$ , 

we see that on $T(A_+)$, if we use the natural basis $y = (\theta_1,\ldots,\theta_{n-1},r)$ to write (2.12), we 
actually have: 
Si.j i = n 

1 n — 1 



w)= k ; , . (2.i3) 



Finally, since $T$ is a diffeomorphism, $T(\partial B(R)) = \partial\Omega$, and hence the boundary conditions 
are given by: 

$g|_{t=0} = g_0 := f_0 \circ T^{-1}$ , $g|_{\partial\Omega \times [0,T]} = 0$ . 

The usual regularity theory ensures that $g \in C_{loc}^{(4+\alpha;2+\alpha/2)}(\Omega \times [0,T])$, and as in the first 
part, it follows that: 

$g \in C^{(1+\delta;(1+\delta)/2)}(\overline{\Omega} \times [0,T]) \quad \forall \delta \in (0,1)$ . (2.14) 

Now take the spatial derivative of (2.12) in a direction $\tau \in span(e_1,\ldots,e_{n-1})$. Denoting 
$g_\tau := D_\tau g$, we obtain that in $\Omega \times [0,T]$: 

$\frac{d}{dt} g_\tau = \sum_{i,j} a_{i,j} D_{i,j} g_\tau + \sum_{i,j} D_\tau a_{i,j} D_{i,j} g + \sum_i b_i D_i g_\tau + \sum_i D_\tau b_i D_i g$ . 

By (2.14), the fourth term on the right hand side, which we denote by $h$, is in $C(\overline{\Omega} \times [0,T])$ 
(and in fact better). The second term above contains mixed second derivatives of $g$, but 
fortunately in $T(A_+)$, the matrix $a_{i,j}(y)$ is given by (2.13), and hence $D_\tau a_{i,j}(y) = 0$. We 
conclude that in $T(A_+) \times [0,T]$, $g_\tau$ satisfies the following uniformly parabolic PDE: 

$\frac{d}{dt} g_\tau = \sum_{i,j} a_{i,j}(y) D_{i,j} g_\tau + \sum_i b_i(y) D_i g_\tau + h(y,t)$ , (2.15) 

and that: 

$g_\tau|_{t=0} = D_\tau g_0$ , $g_\tau|_{(\partial T(A_+) \cap \partial\Omega) \times [0,T]} = 0$ . 

Employing the standard regularity theory, it follows as in the first part that $g_\tau \in 
C^{(1+\delta;(1+\delta)/2)}(\overline{\Theta} \times [0,T])$ for any $\delta \in (0,1)$ and open subset $\Theta \subset \Omega$ with smooth boundary, 
which is in addition bounded away from $\partial T(A_+) \setminus \partial\Omega$. Recalling that $g = f \circ T^{-1}$ and that 
$T$ is a polar change-of-coordinates on $A_+$, the third assertion of the proposition follows 
on $(B(R\xi_0,a) \cap B(R)) \times [0,T]$ for some small enough $a > 0$ (here $B(z,a)$ denotes the ball 
of radius $a$ centered at $z$). By following the bounds obtained in the proof, one may check 
that these do not depend on the choice of $\xi_0$ or the non-radial direction $\tau$. By compactness 
(or using the fact that actually $a > 0$ does not depend on $\xi_0$), the assertion follows on a 
uniform neighborhood of $\partial B(R) \times [0,T]$, and the classical theory takes care of the interior 
regularity. This completes the proof. □ 



Proof of Proposition 2.6. Recall that by the classical theory, $f(\cdot,t) \in C^{4,\alpha}(B(R))$ for every 
$t \in [0,T]$. The second fundamental form of a spatial level set $M$ of $f$ at a point $(x,t)$ with 
$\nabla f(x,t) \neq 0$, i.e. $M = M_{v,t} := \{z \in B(R) ; f(z,t) = v\}$ where $v = f(x,t)$, is given by: 

$II_M(x) = -d\frac{\nabla f}{|\nabla f|}\Big|_{T_xM} = -\frac{D^2 f}{|\nabla f|}\left( Id - \frac{\nabla f}{|\nabla f|} \otimes \frac{\nabla f}{|\nabla f|} \right)\Big|_{T_xM} = -\frac{D^2 f}{|\nabla f|}\Big|_{T_xM}$ . 

Since we assumed in (2.3) that $|\nabla f_0| > 0$ on $\partial B(R)$ and since $\nabla f$ is (uniformly) continuous 
on $\overline{B(R)} \times [0,T]$ by Proposition 2.7 (1), it follows that there exists some $T_0 > 0$ so that 
$|\nabla f| \ge c' > 0$ on all of $\partial B(R) \times [0,T_0]$. By the strong maximum principle and Hopf's 
lemma in the parabolic setting (see e.g. [21, Chapter 2, Theorem 14]), $|\nabla f| \ge c'' > 0$ on 
all of $\partial B(R) \times [T_0,T]$, and by the uniform continuity of $\nabla f$ we conclude that there exists 
$R' \in (0,R)$ and $c, C > 0$ so that: 

$0 < c \le -\langle \nabla f(x,t), \frac{x}{|x|} \rangle \le |\nabla f(x,t)| \le C , \quad \forall |x| \in [R',R] \;\; \forall t \in [0,T]$ , 

and hence: 

$c(R - |x|) \le f(x,t) \le C(R - |x|) , \quad \forall |x| \in [R',R] \;\; \forall t \in [0,T]$ . (2.16) 

Since the level set $M_{0,t}$ coincides with $\partial B(R)$ for all $t \in [0,T]$ ($f > 0$ in $B(R) \times [0,T]$ by 
the strong maximum principle), it follows that: 



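$-\frac{D^2 f}{|\nabla f|}\Big|_{x^\perp} = \frac{1}{R} Id|_{x^\perp} \quad \forall (x,t) \in \partial B(R) \times [0,T]$ , 

where $x^\perp$ is identified with $T_x \partial B(R)$. By Proposition 2.7 (3), the second spatial derivatives 
of $f$ involving a non-radial direction are (uniformly) continuous on $\overline{B(R)} \times [0,T]$, and so we 
deduce that there exists some $R'' \in [R',R)$ so that: 

$-D^2_{\tau,\tau} f(x,t) \ge \frac{c}{2R}$ and $-D^2_{\tau,\frac{x}{|x|}} f(x,t) \ge -B \quad \forall |x| \in [R'',R] \;\; \forall t \in [0,T] \;\; \forall \tau \in S^{n-1} \cap x^\perp$ , 

where: 

$B := \max\left\{ \left| D^2_{\tau,\frac{x}{|x|}} f(x,t) \right| ; x \in \overline{B(R)}, \tau \in S^{n-1} \cap x^\perp, t \in [0,T] \right\} < \infty$ . 

Since the tangential derivatives of $f$ vanish on $\partial B(R)$, it also follows that: 

$|\langle \nabla f(x,t),\tau \rangle| \le B(R - |x|) \quad \forall |x| \in [R'',R] \;\; \forall t \in [0,T] \;\; \forall \tau \in S^{n-1} \cap x^\perp$ . 

Lastly, fixing $\epsilon \in (0,1)$, it follows by Proposition 2.7 (2) and (2.16) that: 

$-f(x,t) D^2_{\xi,\xi} f(x,t) \ge -C C_\epsilon (R - |x|)^{1-\epsilon} \quad \forall |x| \in [R'',R] \;\; \forall \xi \in S^{n-1} \;\; \forall t \in [0,T]$ . 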
We are now ready to bound $D^2 V$, using: 

$D^2 V = -D^2 \log f = \frac{-f D^2 f + \nabla f \otimes \nabla f}{f^2}$ . 

Given $x$ with $|x| \in [R'',R]$ and a direction $\xi \in S^{n-1}$, write $\xi = \cos(\theta)\tau + \sin(\theta)\rho$, where 
$\rho = x/|x|$ and $\langle \tau,\rho \rangle = 0$. For the purpose below, we can assume without loss of generality 
that $\theta \in [0,\pi/2]$. At the point $(x,t)$, denoting in addition $d = R - |x|$, we have by all the 
estimates above: 

$f^2 D^2_{\xi,\xi} V = \cos^2(\theta)(-f D^2_{\tau,\tau} f + \langle \nabla f,\tau \rangle^2) + \sin^2(\theta)(-f D^2_{\rho,\rho} f + \langle \nabla f,\rho \rangle^2)$ 
$\quad + 2\sin(\theta)\cos(\theta)(-f D^2_{\tau,\rho} f + \langle \nabla f,\tau \rangle \langle \nabla f,\rho \rangle)$ 
$\ge \cos^2(\theta)\, c d\, \frac{c}{2R} + \sin^2(\theta)(-C C_\epsilon d^{1-\epsilon} + c^2) + 2\cos(\theta)\sin(\theta)(-CBd - CBd)$ . 

We see that if $d \in [0,d_0]$ for some small enough $d_0 \in (0, R - R'']$, we have for some 
$p,q,r,p',q' > 0$: 

$f^2 D^2_{\xi,\xi} V \ge \cos^2(\theta) p d + \sin^2(\theta) q - 2\cos(\theta)\sin(\theta) r d$ 
$\ge \cos^2(\theta) \frac{p}{2} d + \sin^2(\theta) \left( q - \frac{2r^2}{p} d \right) \ge \cos^2(\theta) p' d + \sin^2(\theta) q'$ , 

and so when $d \in (0,d_0]$: 

$D^2_{\xi,\xi} V \ge \frac{\cos^2(\theta) p' d + \sin^2(\theta) q'}{C^2 d^2} > 0$ 

(indeed, this behaviour as a function of $\theta$, $d$ is the best one can expect). We conclude that 
$D^2_{\xi,\xi} V(x,t) > 0$ (and in fact, tends to $+\infty$ uniformly in $d$) for all $|x| \in [R - d_0, R]$, $t \in [0,T]$ 
and $\xi \in S^{n-1}$. The proof is complete. □ 
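
The step absorbing the mixed term in the last chain of inequalities is an instance of Young's inequality $2ab \le a^2 + b^2$. A sketch with the constants $p, q, r > 0$ of the proof and $d > 0$; the explicit choices of $d_0$, $p'$, $q'$ below are one admissible option and are not claimed to be the ones used by the authors.

```latex
\begin{align*}
2\cos(\theta)\sin(\theta)\, r d
 &= 2\,\Bigl(\cos(\theta)\sqrt{\tfrac{p d}{2}}\Bigr)\Bigl(\sin(\theta)\, r d\,\sqrt{\tfrac{2}{p d}}\Bigr) \\
 &\le \cos^2(\theta)\,\frac{p}{2}\,d + \sin^2(\theta)\,\frac{2 r^2}{p}\,d , \\
% hence
\cos^2(\theta)\, p d + \sin^2(\theta)\, q - 2\cos(\theta)\sin(\theta)\, r d
 &\ge \cos^2(\theta)\,\frac{p}{2}\,d + \sin^2(\theta)\Bigl(q - \frac{2 r^2}{p}\,d\Bigr) .
\end{align*}
% If d <= d_0 <= p q / (4 r^2), the last bracket is at least q/2,
% so one may take p' = p/2 and q' = q/2.
```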



3 Tying up loose ends 

In this section, we provide a complete justification of the proof of Theorem 1.1, described in 
the Introduction. We proceed with the same notations used there. The main technical points 
which we address in this section are showing that the flow map $S_t$ is globally well-defined 
on $\mathbb{R}^n$ (see Lemma 3.1 and its preceding discussion), that the pushed-forward measure 
$\nu_t := (S_t)_*\nu = P_t^U(\exp(-V))\mu$ converges to $\mu$ (see Lemma 3.2), and that the inverse map 
$T_t = S_t^{-1}$ converges (to a contracting map) as $t \to \infty$ (see Lemma 3.3). 

Let $U$, $V$ be as in Theorem 1.1. We assume further that $V$ is sufficiently smooth (e.g. 
$V \in C^{4,\alpha}(\mathbb{R}^n)$ is more than enough), and that: 

$\|\nabla V\|_{C^{1,\alpha}(\mathbb{R}^n)} < \infty$ and $\|D^3 U\|_{L_\infty} < \infty$ . (3.1) 

We will see how to obtain the general case at the very end of this section. 

First, since $\exp(-V) \in C^{4,\alpha}(\mathbb{R}^n)$ and $U \in C_{loc}^{3,\alpha}(\mathbb{R}^n)$, the classical regularity theory 
of parabolic PDEs (e.g. [37]) ensures that $f(x,t) := P_t^U(\exp(-V))(x)$, as the unique 
(bounded) solution to the Cauchy problem: 

$\frac{d}{dt} f = Lf$ , $f|_{t=0} = \exp(-V)$ , (3.2) 

is $C_{loc}^{(4+\alpha;2+\alpha/2)}(\mathbb{R}^n \times [0,\infty))$, and the strong maximum principle ensures that $f(x,t)$ is 
strictly positive for all $t \in [0,\infty)$. Consequently, the advection field $W_t := -\nabla \log P_t^U(\exp(-V))$ 
is $C_{loc}^{(3+\alpha;(3+\alpha)/2)}(\mathbb{R}^n \times [0,\infty))$. In particular, the maps $S_t$ defined by: 

$\frac{d}{dt} S_t(x) = W_t(S_t(x))$ , $S_0 = Id$ , (3.3) 

are indeed locally well-defined as a solution to a flow along a locally Lipschitz vector field 
(e.g. [22, Proposition 1.56]): for any compact subset $C \subset \mathbb{R}^n$, there exists $t(C) > 0$, so that 
(3.3) has a solution for any $(x,t) \in C \times [0,t(C))$. To ensure that the maps $S_t$ are globally 
well-defined, it is enough to show that for any $T > 0$, $W_t(x)$ is globally spatially Lipschitz 
for all $t \in [0,T]$, i.e. $|DW_t(x)| \le C(T)$ for all $(x,t) \in \mathbb{R}^n \times [0,T]$: 

Lemma 3.1. Assuming (3.1), for all $T > 0$, $D^2 \log P_t^U(\exp(-V))(x)$ is uniformly bounded 
in $\mathbb{R}^n \times [0,T]$. 

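Proof. We denote by abuse of notation $V = V(x,t) = -\log P_t^U(\exp(-V))(x)$ and $V_t = 
V(\cdot,t)$. Since $D^2 V \ge 0$ by Theorem 1.2, it suffices to show a uniform bound on $Z = \Delta V$. 
Recall from Section 2 that $V$ satisfies: 

$\frac{d}{dt} V = \Delta V - \langle \nabla V,\nabla V \rangle - \langle \nabla V,\nabla U \rangle$ , $V|_{t=0} = V_0$ . (3.4) 

A direct calculation gives: 

$\frac{d}{dt} Z = \Delta Z - 2\langle \nabla Z,\nabla V \rangle - \langle \nabla Z,\nabla U \rangle$ 
$\quad - 2tr((D^2 V)^* D^2 V) - 2tr((D^2 V)^* D^2 U) - \langle \nabla\Delta U,\nabla V \rangle$ . 

Recalling that $D^2 U \ge 0$ and $D^2 V \ge 0$, we conclude that: 

$\frac{d}{dt} Z \le \Delta Z - 2\langle \nabla Z,\nabla V \rangle - \langle \nabla Z,\nabla U \rangle - \langle \nabla\Delta U,\nabla V \rangle$ . (3.5) 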
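
The "direct calculation" can be sketched as follows, assuming $L = \Delta - \langle \nabla U,\nabla \rangle$ and enough smoothness to differentiate freely; here $f = \exp(-V)$ denotes the solution of (3.2).

```latex
% Step 1: the equation (3.4) for V = -\log f.
\begin{align*}
\nabla f = -f\,\nabla V , \qquad
\Delta f = f\,\bigl(|\nabla V|^2 - \Delta V\bigr) , \qquad
\tfrac{d}{dt} f = -f\,\tfrac{d}{dt} V , \\
\tfrac{d}{dt} f = \Delta f - \langle \nabla U,\nabla f\rangle
\quad \Longrightarrow \quad
\tfrac{d}{dt} V = \Delta V - |\nabla V|^2 - \langle \nabla V,\nabla U\rangle .
\end{align*}
% Step 2: apply \Delta to (3.4) and set Z = \Delta V, using the two identities
\begin{align*}
\Delta |\nabla V|^2 &= 2\,\mathrm{tr}\bigl((D^2 V)^2\bigr) + 2\,\langle \nabla \Delta V,\nabla V\rangle , \\
\Delta \langle \nabla V,\nabla U\rangle &= \langle \nabla \Delta V,\nabla U\rangle
   + 2\,\mathrm{tr}\bigl(D^2 V\,D^2 U\bigr) + \langle \nabla V,\nabla \Delta U\rangle .
\end{align*}
```

Subtracting these two identities from $\Delta Z$ gives the displayed equation for $\frac{d}{dt} Z$. The inequality (3.5) then uses that $tr(AB) \ge 0$ whenever $A$ and $B$ are symmetric positive semi-definite matrices.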

To apply the maximum principle to (3.5), we need to control the zeroth order (right- 
most) term. To this end, we claim that: 

$\|\nabla V_t\|_{L_\infty} \le \|\nabla V_0\|_{L_\infty} \quad \forall t \ge 0$ . (3.6) 

This follows e.g. by using the pointwise estimate of Bakry and Émery, refined by Bakry [2, 
Proposition 1], which when $U$ is convex yields $|\nabla P_t^U(f)| \le P_t^U(|\nabla f|)$. Together with the 
maximum principle, this indeed implies that: 

$|\nabla V_t|(x) = \frac{|\nabla P_t^U(\exp(-V_0))(x)|}{P_t^U(\exp(-V_0))(x)} \le \frac{P_t^U(|\nabla V_0| \exp(-V_0))(x)}{P_t^U(\exp(-V_0))(x)} \le \|\nabla V_0\|_{L_\infty}$ . 

Now applying formally the maximum principle to (3.5), using (3.6) and recalling the 
definition of $Z$, we obtain: 

$\|\Delta V_t\|_{L_\infty} \le \|\Delta V_0\|_{L_\infty} + t n \|D^3 U\|_{L_\infty} \|DV_0\|_{L_\infty}$ . 

The assumption (3.1) ensures (in particular) that all terms above are bounded, and hence 
$\Delta V_t$ is uniformly bounded on $[0,T]$ and it seems that we are done. 

However, there is a technical issue here: to appeal to the maximum principle on the 
unbounded domain $\mathbb{R}^n$, we have to a-priori verify that $\Delta V_t(x)$ does not grow spatially faster 
than $\exp(C|x|^2)$ for some $C > 0$, uniformly in $t \in [0,T]$ (see e.g. [21, Chapter 2, Theorem 
9]). The rest of the proof is dedicated to verifying this a-priori growth rate. 

First, observe that $V$ grows spatially at most linearly, uniformly in $t \in [0,T]$. To see 
this without appealing to compactness, denote by $m$ the minimum of $V_0$, which (by the 
maximum principle) is also a lower bound for $V(\cdot,t)$ for any $t > 0$. Fix $C > 0$ and let $r > 0$ be so that: 

$\exp(-C)\mu(B(r)) + \exp(-m)(1 - \mu(B(r))) < 1$ . 

It follows since $\int \exp(-V(x,t))d\mu(x) = 1$ for any $t > 0$, that for any such $t$ there exists 
$x_0(t) \in B(r)$ so that $V(x_0(t),t) \le C$. Consequently, (3.6) implies that $V(x,t) \le 
\|\nabla V_0\|_{L_\infty} |x - x_0(t)| + V(x_0(t),t) \le \|\nabla V_0\|_{L_\infty}(|x| + r) + C$. 
Now write (3.4) as: 

$\frac{d}{dt} V - \Delta V = h$ , $V|_{t=0} = V_0$ , 

where $-h = \langle \nabla V,\nabla V \rangle + \langle \nabla V,\nabla U \rangle$. By the assumptions of Theorem 1.1, $|\nabla U|(x)$ grows at 
most linearly in $|x|$, and together with (3.6), it follows that $h$ too grows spatially at most 
linearly. Consequently, applying an interior regularity estimate (e.g. applying the estimate 
[37, Chapter IV, (10.2)] for the Sobolev space $W_p^{(2,1)}$ with $p$ arbitrarily large, followed by a 
variant of Morrey's embedding theorem as in the Corollary after [37, Chapter IV, Theorem 
9.1]), it follows that: 

$\|V\|_{C^{(1+\alpha;(1+\alpha)/2)}(B(R) \times [0,T])}$ 
$\le C(n,T,\alpha)\left( \|h\|_{C^0(B(R') \times [0,T])} + \|V_0\|_{C^2(B(R'))} + \|V\|_{C^0(B(R') \times [0,T])} \right)$ , 

for any $\alpha \in (0,1)$ and $1 \le R \le R' - 1$. Since $\|\nabla V_0\|_{C^{1,\alpha}(\mathbb{R}^n)}$ is assumed bounded in (3.1), 
and as explained above, $V_0$, $V$ and $h$ grow spatially at most linearly, it follows that so does 
$\|V\|_{C^{(1+\alpha;(1+\alpha)/2)}(B(R) \times [0,T])}$. 

Using this and arguing as above, we verify that $\|h\|_{C^{(\alpha;\alpha/2)}(B(R) \times [0,T])}$ grows at most 
quadratically in $R$. Applying the interior Schauder estimate again (e.g. [37, Chapter IV, 
Theorem 10.1]), it follows that: 

$\|V\|_{C^{(2+\alpha;1+\alpha/2)}(B(R) \times [0,T])}$ 
$\le C(n,T,\alpha)\left( \|h\|_{C^{(\alpha;\alpha/2)}(B(R') \times [0,T])} + \|V_0\|_{C^{2,\alpha}(B(R'))} + \|V\|_{C^0(B(R') \times [0,T])} \right)$ , 

for any $1 \le R \le R' - 1$. Using (3.1) again, we conclude that $D^2 V_t$ a-priori spatially grows 
at most polynomially, uniformly in $t \in [0,T]$, thereby concluding the proof. □ 

We conclude from Lemma 3.1 that the maps $S_t$ are well-defined (at least under the 
assumption (3.1)). Moreover, it follows that $S_t$ are diffeomorphisms (e.g. [22, Theorem 
1.61]), since the inverse maps $T_{t,t} = T_t := S_t^{-1}$ may be obtained by running the flow 
backwards: 

$\frac{d}{d\tau} T_{t,\tau}(x) = -W_{t-\tau}(T_{t,\tau}(x))$ , $T_{t,0} = Id$ , $\tau \in [0,t]$ . 

Clearly, the maps $S_t$ and $T_t$ inherit the symmetries of the vector field $W_t = -\nabla \log P_t^U(\exp(-V))$. 
As explained in the proof of Theorem 2.2, $-\log P_t^U(\exp(-V))$ is invariant under the common 
symmetries of $U$ and $V$, i.e. our symmetry assumptions (1.2), and so its gradient commutes 
with the group $O(E_1,\ldots,E_k)$; our maps therefore satisfy our symmetry assumptions as 
well. 

Theorem 1.2 guarantees that $DW_t \ge 0$ and hence $(DW_t)^* + DW_t \ge 0$ for every $t \ge 0$. 
Consequently: 

$\frac{d}{dt}(DS_t)^*(x) DS_t(x) = (DS_t)^*(x)(DW_t)^*(S_t x) DS_t(x) + (DS_t)^*(x) DW_t(S_t x) DS_t(x) \ge 0$ , 

and hence $(DS_t)^* DS_t \ge Id$ for every $t \ge 0$. In other words, $S_t$ is locally an expansion. 
Since $S_t$ is also a diffeomorphism, it follows that it is in fact an expansion globally. Indeed, 
$(DT_t)^* DT_t \le Id$, which implies by integration and the triangle inequality that $|T_t(x) - 
T_t(y)| \le |x - y|$. 
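
As a sanity check of the construction, the flow can be computed in closed form in a Gaussian model case. The following worked example rests on assumptions not made in the text: $U(x) = |x|^2/2 + \frac{n}{2}\log(2\pi)$, so that $\mu$ is the standard Gaussian measure and $P_t^U$ is the Ornstein-Uhlenbeck semigroup with generator $L = \Delta - \langle x,\nabla \rangle$, and $V(x) = a|x|^2/2 - \frac{n}{2}\log(1+a)$ for a hypothetical constant $a \ge 0$.

```latex
% Here nu = exp(-V) mu = N(0, (1+a)^{-1} Id). Under the Ornstein-Uhlenbeck semigroup,
% P_t^U(exp(-V)) mu is the law of e^{-t} X + sqrt(1 - e^{-2t}) G,
% with X ~ nu and G ~ mu independent.
\begin{align*}
\nu_t &= N(0,\sigma_t^2\,\mathrm{Id}) , &
\sigma_t^2 &= \frac{e^{-2t}}{1+a} + 1 - e^{-2t} = 1 - \frac{a}{1+a}\,e^{-2t} , \\
W_t(x) &= -\nabla \log \frac{d\nu_t}{d\mu}(x) = \Bigl(\frac{1}{\sigma_t^2} - 1\Bigr)\,x , &
DW_t &= \frac{1-\sigma_t^2}{\sigma_t^2}\,\mathrm{Id} \;\ge\; 0 , \\
S_t(x) &= \frac{\sigma_t}{\sigma_0}\,x , &
T_t(x) &= \frac{\sigma_0}{\sigma_t}\,x \;\longrightarrow\; \frac{x}{\sqrt{1+a}} \quad (t \to \infty) .
\end{align*}
% S_t solves (3.3) because d/dt (sigma_t^2) = 2 (1 - sigma_t^2),
% i.e. d/dt (log sigma_t) = 1/sigma_t^2 - 1.
```

In this example $S_t$ is an expansion by the factor $\sigma_t/\sigma_0 \ge 1$, and the limiting map $T(x) = x/\sqrt{1+a}$ is a contraction pushing $\mu$ forward onto $\nu$, as expected.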

Next, we address the question of convergence of $\nu_t := P_t^U(\exp(-V))\mu$ to $\mu$. Although we 
will only require convergence in $L_1$ for the sequel, we state the following for completeness: 

Lemma 3.2. As $t \to \infty$, we have: 

1. $P_t^U(\exp(-V)) \to 1$ in $L_p(\mu)$, for any $p \in [1,\infty)$. 

2. $P_t^U(\exp(-V)) \to 1$ in $L_\infty(C)$, for any compact set $C \subset \mathbb{R}^n$. 

3. $\frac{d\nu_t}{dx} \to \frac{d\mu}{dx}$ in $L_p(dx)$, for any $p \in [1,\infty]$. 



For the proof, first recall that by (1.6), $-L = -\Delta + \langle \nabla,\nabla U \rangle$ is a symmetric positive semi- 
definite operator on the subspace $C^\infty(\mathbb{R}^n) \cap L_2(\mu)$, and hence admits a Friedrichs extension 
to a self-adjoint positive semi-definite operator on a larger dense subspace $\mathcal{D}$ of $L_2(\mu)$, 
which we also denote by $-L$. Since $U$ is convex and $\mu = \exp(-U(x))dx$ is a probability 
measure, it is known that $-L$ has a strictly positive spectral-gap $\lambda_1 > 0$ away from the 
trivial eigenvalue of 0, corresponding to the constant functions: $\int -fLf\,d\mu \ge \lambda_1 \int f^2 d\mu$ for 
all $f \in \mathcal{D}_0 := \{f \in \mathcal{D} ; \int f d\mu = 0\}$. For instance, by [30] (see also [48]), one may estimate 
$\lambda_1 \ge c\left( \int |x| d\mu(x) \right)^{-2} > 0$ for some universal numeric constant $c > 0$. 

Proof of Lemma 3.2. Since $\lambda_1$ is strictly positive, the Spectral Theorem implies that $P_t^U(\exp(-V)) = 
\exp(tL)(\exp(-V))$ tends in $L_2(\mu)$ to the projection of $\exp(-V)$ onto the constant func- 
tions, i.e. to the constant function $1 = \int \exp(-V)d\mu$. Since $P_t^U(\exp(-V))$ is bounded in $L_\infty$ (as in 
Subsection 2.1), we deduce the first claim for $p \in [2,\infty)$ by interpolation (and by Jensen's 
inequality this extends to $p \in [1,\infty)$). Next, we follow an argument similar to that used by 
Ledoux [39]. Denoting $f = \exp(-V)$, write: 

$|P_t^U(f)(x) - 1| = \left| P_t^U(f)(x) - \int P_t^U(f)(y)d\mu(y) \right| \le \int |P_t^U(f)(x) - P_t^U(f)(y)| d\mu(y)$ . 

Certainly $|P_t^U(f)(x) - P_t^U(f)(y)| \le |\nabla P_t^U(f)(z)||x - y|$ for some intermediate point $z \in [x,y]$. 
But using that $U$ is convex, the following smoothing estimate is known ([40]): 

$|\nabla P_t^U(f)(z)| \le \frac{1}{\sqrt{2t}} \|f\|_{L_\infty}$ , 

and so: 

$|P_t^U(f)(x) - 1| \le \frac{1}{\sqrt{2t}} \|f\|_{L_\infty} \left( |x| + \int |y| d\mu(y) \right)$ . 

The uniform convergence (as $t \to \infty$) on compact subsets follows. Moreover, since $|x| \exp(-U(x))$ 
is necessarily bounded, we obtain the third claim for $p = \infty$. The third claim for $p = 1$ is 
equivalent to the first one with $p = 1$, and so by interpolation, the third claim follows for 
all $p \in [1,\infty]$. □ 

Recall that a sequence of Borel measures $\{\eta_k\}$ is said to converge to a Borel measure 
$\eta$ weakly (or in the weak*-topology) if $\int \varphi\, d\eta_k \to \int \varphi\, d\eta$ for any bounded continuous test 
function $\varphi$; we will denote this by $\eta_k \rightharpoonup \eta$. We define the $L_1$ distance between two absolutely 
continuous Borel measures $\eta_1, \eta_2$ on $\mathbb{R}^n$ to be: 

$d_{L_1}(\eta_1,\eta_2) := \int \left| \frac{d\eta_1}{dx} - \frac{d\eta_2}{dx} \right| dx$ ; 

this coincides with the usual total-variation distance up to a factor of 2. Clearly, convergence 
in $L_1$ implies weak convergence. 

Lemma 3.3. Let $\{\mu_k\}$ and $\{\nu_k\}$ denote two sequences of absolutely continuous Borel mea- 
sures on $\mathbb{R}^n$, such that each $\nu_k$ is the push-forward of $\mu_k$ by a contracting map $T_k : \mathbb{R}^n \to \mathbb{R}^n$. 
Assume that $d_{L_1}(\mu_k,\mu) \to 0$ and $\nu_k \rightharpoonup \nu$. Then there exists a contraction $T : \mathbb{R}^n \to \mathbb{R}^n$ 
pushing forward $\mu$ onto $\nu$. Moreover, any common symmetries possessed by $T_k$ are preserved 
by $T$. 

Proof. First, note that $T_k(0)$ must be uniformly bounded. Indeed, let $B(R_1)$ denote a ball 
around the origin so that $\mu(B(R_1)) \ge 3/4$. The $L_1$ convergence immediately implies that 
$\mu_k(B(R_1)) \to \mu(B(R_1))$, and so $\mu_k(B(R_1)) \ge 2/3$ for large enough $k$. Similarly, if $B(R_2)$ 
denotes a ball so that $\nu(B(R_2)) \ge 3/4$, it follows easily from the weak convergence that 
$\nu_k(B(R_2)) \to \nu(B(R_2))$ (here we need to use the fact that the ball has finite perimeter and 
that our measures are absolutely continuous with respect to Lebesgue measure), and hence 
$\nu_k(B(R_2)) \ge 2/3$ for large enough $k$. Consequently, for large enough $k$, $\mu_k(T_k^{-1}(B(R_2))) = 
\nu_k(B(R_2)) \ge 2/3$, and therefore $T_k^{-1}(B(R_2)) \cap B(R_1)$ is non-empty. Since $T_k$ is a contraction, 
it follows that $T_k(0) \in B(R_1 + R_2)$. 

Next, by passing to a subsequence if necessary, we may assume that $T_k(0)$ converges. 
Since $T_k$ are all contractions, and hence uniformly (Lipschitz) continuous, it follows by 
compactness and a standard diagonalization argument that, after passing to an appropriate 
subsequence, $T_k$ converges uniformly on compact subsets of $\mathbb{R}^n$ to some map $T$, which is 
consequently a contraction, and which preserves the common symmetries of $T_k$. It remains to 
show that $T$ pushes forward $\mu$ onto $\nu$. 

This is equivalent to showing that $\int \varphi(Tx)d\mu(x) = \int \varphi(y)d\nu(y)$ for any bounded con- 
tinuous test function $\varphi : \mathbb{R}^n \to \mathbb{R}$. Since by definition, for any $k$: 

$\int \varphi(T_k x)d\mu_k(x) = \int \varphi(y)d\nu_k(y)$ , 

and the right hand side converges to $\int \varphi(y)d\nu(y)$, it remains to show that the left hand side 
converges to $\int \varphi(Tx)d\mu(x)$. Indeed: 

$\left| \int \varphi(T_k x)d\mu_k(x) - \int \varphi(Tx)d\mu(x) \right| \le \left| \int \varphi(T_k x)d\mu_k(x) - \int \varphi(T_k x)d\mu(x) \right| + \left| \int \varphi(T_k x)d\mu(x) - \int \varphi(Tx)d\mu(x) \right|$ . 

The first term on the right hand side converges to 0 since $\varphi$ is bounded and $d_{L_1}(\mu_k,\mu) \to 0$. 
The second term converges to 0 by Lebesgue's dominated convergence theorem, since (the 
bounded) $\varphi(T_k x)$ converges pointwise to $\varphi(Tx)$ (in fact uniformly on compact subsets). 
This concludes the proof. □ 

Lemma 3.2 (case (3) with $p = 1$) ensures that $\nu_t$ converges to $\mu$ in $L_1$. Since $\nu$ is the 
push-forward of $\nu_t$ via $T_t$, which is a contraction, it follows by Lemma 3.3 that there exists a 
contraction pushing forward $\mu$ onto $\nu$ and satisfying our symmetry assumptions. This 
concludes the proof of Theorem 1.1 in the case that $U$ and $V$ are assumed smooth and under 
the additional assumptions of (3.1). To conclude the theorem in full generality, apply 
Lemma 3.3 again to see that there exists a contraction pushing forward $\mu$ onto $\nu$, whenever 
these measures may be approximated by smooth measures satisfying the assumptions of 
the theorem and (3.1). Such approximation is always possible by a standard argument: 
applying the Legendre transform to $V$, redefining the resulting function to be $+\infty$ beyond 
some large level, and applying the transform again, we obtain a convex Lipschitz function 
with the same symmetries, and it remains to convolve it with a smooth rotation-invariant 
mollifier, yielding the first part of (3.1); a similar argument applies to the function $U$, 
whose special form (1.3) reduces the approximation to an easy one-dimensional problem. 
Lemma 3.3 thus implies the general case of Theorem 1.1. 

4 Applications 

The first application we would like to describe pertains to a generalization of the Gaussian 
Correlation Conjecture. This conjecture asks whether for any two convex subsets $A, B \subset \mathbb{R}^n$, 
which are in addition centrally-symmetric ($C$ is called centrally-symmetric if $C = -C$), the 
following inequality is valid for the standard Gaussian measure $\gamma_n$ on $\mathbb{R}^n$: 

$\gamma_n(A \cap B) \ge \gamma_n(A)\gamma_n(B)$ . (4.1) 

We refer to [50, 25, 17] and the references therein for the history of this conjecture, which 
remains open for $n \ge 3$. One of the most general results is due to Hargé [25], who confirmed 
the validity of (4.1) when one of the sets is a (centrally-symmetric) ellipsoid. This was 
subsequently given a different proof by Cordero-Erausquin [17], as a direct corollary of 
Caffarelli's Contraction Theorem (in this context, it is worthwhile pointing out that our 
construction of the expanding map $T^{-1}$ closely resembles Hargé's argument). Replacing 
Caffarelli's theorem with Theorem 1.1 in Cordero-Erausquin's argument, we obtain the 
following generalization: 

Corollary 4.1. Let $\mu = \exp(-U(x))dx$ denote a probability measure on $\mathbb{R}^n$ as in Theorem 
1.1, which is in addition centrally symmetric (i.e. the quadratic part of $U$ on $E_0$ is assumed 
even). Let $B$ denote a centrally-symmetric convex subset of $\mathbb{R}^n$ satisfying the following 
symmetry assumptions: 

$\exists C_B \subset \mathbb{R}^{\dim E_0 + k} \quad 1_B(x) = 1_{C_B}(Proj_{E_0}x, |Proj_{E_1}x|, \ldots, |Proj_{E_k}x|)$ . 

Let $A$ denote a centrally-symmetric subset of $\mathbb{R}^n$ so that, writing for $x \in \mathbb{R}^n$, $x = (x_0,x_1,\ldots,x_k)$ 
with $x_i \in E_i$, we have: 

if $(x_0,x_1,\ldots,x_k) \in A$ then 

$\forall y_0 \in E_0$, $\|y_0\|_{\mathcal{E}} \le \|x_0\|_{\mathcal{E}}$, $\forall t_i \in [-1,1]$, we have $(y_0,t_1x_1,\ldots,t_kx_k) \in A$ , (4.2) 

where $\|\cdot\|_{\mathcal{E}}$ is the norm associated with some centrally-symmetric ellipsoid $\mathcal{E} \subset E_0$. Then: 

$\mu(A \cap B) \ge \mu(A)\mu(B)$ . 

Clearly, this generalizes the result of Hargé and Cordero-Erausquin, by choosing $\mu = \gamma_n$ 
and $E_0 = \mathbb{R}^n$. 

Proof. First, by applying an appropriate linear transformation $P$ in $E_0$ which leaves the 
orthogonal complement invariant, we may assume that $\mathcal{E}$ is a Euclidean ball in $E_0$, since 
$P(B)$ and $P_*(\mu)$ continue to satisfy the assumptions of the theorem (indeed, $P$ only affects 
the quadratic part of $U$, which remains quadratic and even). Defining the probability 
measure $\mu_B$ as the restriction of $\mu$ onto $B$, i.e. $\mu_B(C) = \mu(C \cap B)/\mu(B)$, our task is to 
show that $\mu_B(A) \ge \mu(A)$. It is standard to approximate $1_B/\mu(B)$ in $L_1(\mathbb{R}^n)$ by functions 
of the form $\exp(-V_k)$, where $V_k$ is convex and satisfies the same symmetries as $B$, implying 
that $\exp(-V_k)\mu$ tends to $\mu_B$ in total-variation. Applying Theorem 1.1 and Lemma 3.3, 
we deduce that there exists a contraction $T$ pushing forward $\mu$ onto $\mu_B$ and satisfying 
our symmetry assumptions. Since $T$ commutes with $O(E_1,\ldots,E_k)$, it follows easily that 
$Proj_{E_i}T(x)$ is radial for $i = 1,\ldots,k$: 

$Proj_{E_i}T(x) = T_i(x_0,|x_1|,\ldots,|x_k|)\frac{x_i}{|x_i|}$ if $x_i \neq 0$ and 0 otherwise . (4.3) 

Moreover, since $B$ and $\mu$ were assumed centrally-symmetric, it is easy to check that $T$ will 
also preserve this additional symmetry. Denoting by $R_i$ the reflection in the subspace $E_i$, 
i.e. $R_i(x) = x - 2Proj_{E_i}x$ for $i = 0,1,\ldots,k$, we conclude that $T$ commutes with all the 
$R_i$'s. 

It remains to note that $T(A) \subset A$. Indeed, using the commutation with $R_i$ and the 
contraction property of $T$, we have: 

$2|Proj_{E_i}T(x)| = |R_i(T(x)) - T(x)| = |T(R_i(x)) - T(x)| \le |R_i(x) - x| = 2|Proj_{E_i}x|$ , 

and so $|Proj_{E_i}T(x)| \le |Proj_{E_i}x|$ for $i = 0,1,\ldots,k$. Together with (4.3) and the symmetries 
(4.2) of $A$, it follows that $T(A) \subset A$. Consequently $A \subset T^{-1}(A)$, and therefore: 

$\frac{\mu(A \cap B)}{\mu(B)} = \mu_B(A) = \mu(T^{-1}(A)) \ge \mu(A)$ . 

The proof is complete. □ 

Remark 4.2. It is possible to replace the requirement $t_i \in [-1,1]$ in (4.2) by $t_i \in [0,1]$. 
This is achieved by using the Brenier map $T_{opt}$ of Theorem 1.3 instead of $T$ in the proof 
above, thereby ensuring that the $\{T_i\}_{i=1}^k$ in (4.3) are always non-negative, as explained in 
Section 5. 

The following two additional corollaries may be easily obtained from the previous one 
by integration by parts: 

Corollary 4.3. Let $\mu$ denote a probability measure on $\mathbb{R}^n$ as in Theorem 1.1, which is in 
addition centrally symmetric. Let $f, g : \mathbb{R}^n \to \mathbb{R}_+$ denote two measurable bounded functions, 
so that for each $a, b > 0$, the level sets $f^{-1}([a,\infty))$ and $g^{-1}([b,\infty))$ satisfy the assumptions 
on the sets $A$ and $B$ in Corollary 4.1, respectively. Then: 

$\int fg\,d\mu \ge \int f d\mu \int g\,d\mu$ . 

Corollary 4.4. Let $\mu$, $\nu$ denote two probability measures as in Theorem 1.1, and assume in 
addition that both are centrally symmetric. Let $\Gamma : \mathbb{R}^n \to \mathbb{R}_+$ denote a measurable function 
such that all of its level sets $\Gamma^{-1}([0,a])$ (individually) satisfy the assumption on the set $A$ 
in Corollary 4.1. Then: 



These corollaries generalize the correlation inequalities obtained in [9, 25, 14] for the case 
$\dim E_0 = n$. We remark that when $\dim E_0 = 0$, the corollaries may be obtained directly 
without appealing to Theorem 1.1, so the more interesting case is when $0 < \dim E_0 < n$. 

Finally, we also mention that contracting maps constitute a very useful tool to transfer 
isoperimetric inequalities from one measure-metric space to another. Note that the measure 
$\mu$ of Theorem 1.1 is a product measure, with each factor being either a Gaussian or a log- 
concave radially symmetric measure. The isoperimetric inequality satisfied by the former 
factor is well known [51, 8], and has recently been identified (up to numeric constants) for 
the latter factor [27]. The tools to transfer these inequalities to the product measure have 
also recently been obtained [3, 4, 5, 47], and so consequently, the isoperimetric inequality 
satisfied by $\mu$ is well understood. Using the contracting map $T$ of Theorem 1.1, it follows 
that the same isoperimetric inequality is satisfied by the measure $\nu$. We refer to [38] for 
further examples of using contracting maps to transfer isoperimetric inequalities, and for 
further information. 

5 Caffarelli's proof revisited 

Let us now sketch the proof of Theorem 1.3, which is based on the proof of [14, Theorem 
11], but requires an additional ingredient from [14] in the form of Theorem 5.1 below. 
Throughout this section we use T to denote the Brenier optimal-map. 

5.1 The Radial Case 

We begin with the elementary case when $\mu = \exp(-\rho(|x|))dx$ and $\nu = \exp(-(\rho + v)(|x|))dx$
are radial. This case does not require the use of Theorem 5.1, and as we will see, clearly
motivates the condition $\rho''' \le 0$ in Theorem 1.1.

First, it is immediate to reduce to the one-dimensional case, when $\mu$ and $\nu$ are supported
on $\mathbb{R}_+$. Indeed, by the radial symmetry and the uniqueness of the Brenier map
$T = \nabla\varphi$ with $\varphi : \mathbb{R}^n \to \mathbb{R}$ a convex function, it follows that $T$ must also be radially
symmetric, i.e. commute with the orthogonal group. Consequently, we may write
$\varphi(x) = \phi(|x|)$ with $\phi : \mathbb{R}_+ \to \mathbb{R}$ convex, and $T(r\theta) = T_1(r)\theta$ for $\theta \in S^{n-1}$ and $r \in \mathbb{R}_+$.
Here $T_1 = \phi' : \mathbb{R}_+ \to \mathbb{R}_+$ is precisely the Brenier map pushing forward $\exp(-\rho(r)) r^{n-1} dr$ onto
$\exp(-(\rho(r) + v(r))) r^{n-1} dr$. Denoting $\rho_1(r) = \rho(r) - (n-1)\log r$, we see that $\rho_1$ remains
convex and $\rho_1''' \le 0$, and so it is enough to show that when in addition $v : \mathbb{R}_+ \to \mathbb{R}$ is
convex and non-decreasing, the Brenier map $T_1$ pushing forward $\mu_1 = \exp(-\rho_1(r))dr$ onto
$\nu_1 = \exp(-(\rho_1(r) + v(r)))dr$ is a contraction.

Indeed, in the one-dimensional case, the derivative of a convex function is simply a
monotone non-decreasing function, and so the Brenier map is the unique non-decreasing map
pushing forward $\mu_1$ onto $\nu_1$, given by:

\[ \int_0^{T_1(x)} \exp(-(\rho_1(r) + v(r))) \, dr = \int_0^x \exp(-\rho_1(r)) \, dr . \tag{5.1} \]

Since $\rho_1$ and $v$ are assumed smooth enough, so is $T_1$. Taking derivatives, we obtain:

\[ \log T_1'(x) = -\rho_1(x) + \rho_1(T_1(x)) + v(T_1(x)) . \tag{5.2} \]

Assume that the maximum of $T_1'$ is attained at $x_0 \in \mathbb{R}_+$. To ensure this, one would actually
need to restrict $\nu_1$ onto a compact subset, in which case $\lim_{x \to \infty} T_1'(x) = 0$ and so the
(positive) maximum is attained, and conclude with an approximation argument (as in [14]);
we omit the details here. Our task is to show that $T_1'(x_0) \le 1$. If $x_0 = 0$, since $T_1(0) = 0$
and $\exp(-v(0)) \ge 1$ (otherwise $\mu$ and $\nu$ could not both have total mass 1), it follows from (5.2) that
$T_1'(0) \le 1$, as required. Otherwise, denoting $F = \log T_1'$, since $F$ and $T_1'$ have a local
maximum at $x_0$, it follows that $T_1''(x_0) = 0$ and that:

\begin{align*}
0 \ge F''(x_0) &= -\rho_1''(x_0) + (T_1'(x_0))^2 \left( \rho_1''(T_1(x_0)) + v''(T_1(x_0)) \right) + T_1''(x_0) \left( \rho_1'(T_1(x_0)) + v'(T_1(x_0)) \right) \\
&= -\rho_1''(x_0) + (T_1'(x_0))^2 \left( \rho_1''(T_1(x_0)) + v''(T_1(x_0)) \right) .
\end{align*}

Since $v'' \ge 0$ and $\rho_1'' > 0$, we obtain that:

\[ (T_1'(x_0))^2 \le \frac{\rho_1''(x_0)}{\rho_1''(T_1(x_0))} . \tag{5.3} \]

In Caffarelli's argument, $\rho_1$ is a quadratic polynomial, and therefore the right-hand side
above is identically 1. However, since $T_1(x) \le x$ for all $x \in \mathbb{R}_+$, as easily verified from
(5.1) and the fact that $v$ is non-decreasing, we obtain by the mean-value theorem that the
right-hand side is not greater than 1 as soon as $\rho_1''' \le 0$. This concludes the proof and
explains the latter condition.
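
The one-dimensional argument above can be illustrated numerically. The following is a minimal sketch and not part of the original proof: the choices $\rho_1(r) = r^2/2$ and $v(r) = r$, the truncation radius, the grid size and all variable names are assumptions made purely for illustration. It builds the monotone map of (5.1) from cumulative masses, evaluates $T_1'$ through the normalised form of (5.2), and reports whether $T_1(x) \le x$ and $T_1'(x) \le 1$ hold on the range where truncation effects are negligible.

```python
import numpy as np

# Illustrative assumptions (not from the text): rho_1(r) = r^2/2, so rho_1''' = 0,
# and v(r) = r, which is convex and non-decreasing. Both measures are truncated
# to [0, R_MAX]; the map is only inspected on [0, X_MAX], where the truncation
# changes the normalising constants by a negligible amount.
R_MAX, X_MAX, N = 6.0, 4.0, 6001
r = np.linspace(0.0, R_MAX, N)

def rho1(s):
    return 0.5 * s**2

def v(s):
    return s

def cumulative_mass(density, grid):
    """Cumulative trapezoidal integral of `density` over `grid`, starting at 0."""
    steps = 0.5 * (density[1:] + density[:-1]) * np.diff(grid)
    return np.concatenate(([0.0], np.cumsum(steps)))

mass_mu = cumulative_mass(np.exp(-rho1(r)), r)
mass_nu = cumulative_mass(np.exp(-(rho1(r) + v(r))), r)
Z_mu, Z_nu = mass_mu[-1], mass_nu[-1]  # normalising constants

# Monotone map: solve (5.1) for T_1(x) after normalising both sides to mass 1.
T1 = np.interp(mass_mu / Z_mu, mass_nu / Z_nu, r)

# Derivative of the map from (5.2), with the normalising constants included.
dT1 = (np.exp(-rho1(r)) / Z_mu) / (np.exp(-(rho1(T1) + v(T1))) / Z_nu)

inside = r <= X_MAX
print("max of T_1(x) - x on [0, X_MAX]:", np.max(T1[inside] - r[inside]))
print("max of T_1'(x)    on [0, X_MAX]:", np.max(dT1[inside]))
```

Since $\rho_1$ is quadratic here, this is an instance of Caffarelli's original setting, where the right-hand side of (5.3) is identically 1; replacing `rho1` by another convex function with non-positive third derivative would probe the more general condition.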

We remark that in this simple case, the Brenier map and the map we construct in our 
proof of Theorem 1.1 do in fact coincide, since the latter one is also radially symmetric, 
and is constructed as a limit of diffeomorphisms, and hence must be monotone on each ray 
from the origin. 



5.2 The General Case 

Let $\mu = \exp(-U(x))dx$ and $\nu = \exp(-(U(x) + V(x)))dx$ be two probability measures on $\mathbb{R}^n$,
satisfying the assumptions in Theorem 1.1. We will actually assume that $\nu$ is supported
on a compact convex set $C$, to be specified later on, and that $U \in C^{3,\alpha}(\mathbb{R}^n)$, $U$ is strictly
convex, and $V \in C^{3,\alpha}(C)$; the general case follows by a standard approximation argument,
under which one may show that the corresponding Brenier maps converge to the gradient
of a convex function, i.e. the Brenier map for the limiting measures, and the contraction
property is trivially preserved in the limit.

Let $T = \nabla\varphi$ denote the Brenier map pushing forward $\mu$ onto $\nu$, where $\varphi : \mathbb{R}^n \to \mathbb{R}$
is a convex potential. It follows from our assumptions and Caffarelli's regularity theory
[12, 11, 13] that $\varphi \in C^{4,\alpha}_{loc}(\mathbb{R}^n)$. It also follows from the proof of [14, Lemma 4] and
the subsequent remark that $\|DT\|(x) = \max_{\xi \in S^{n-1}} D^2_{\xi,\xi}\varphi(x)$ attains a maximum in $\mathbb{R}^n$,
since $D^2_{\xi,\xi}\varphi(x)$ tends to 0 as $|x| \to \infty$ uniformly in $\xi \in S^{n-1}$, when $C$ is convex. We
will denote by $x_0$ a point where this maximum is attained. Our task is to show that
$\|T\|_{Lip} := D^2_{e,e}\varphi(x_0) \le 1$, where $e \in S^{n-1}$ is the eigenvector of $D^2\varphi(x_0)$ corresponding to
its maximal eigenvalue, and hence:

\[ D_e D\varphi(x_0) = D^2_{e,e}\varphi(x_0) \, e . \tag{5.4} \]

As usual, attaining the maximum at $x_0$ implies that:

\[ \nabla D^2_{e,e}\varphi(x_0) = D^2_{e,e} T(x_0) = 0 , \quad D^2 D^2_{e,e}\varphi(x_0) = D^2_{e,e} DT(x_0) \le 0 . \tag{5.5} \]

As in (5.2), the change-of-variables formula resulting from the definition of push-forward
is:

\[ \log\det DT(x) = -U(x) + U(T(x)) + V(T(x)) . \tag{5.6} \]

Differentiating (5.6) twice in the direction of $e$, we obtain:

\begin{align*}
& -\mathrm{tr}\left( (DT)^{-1}(x) D_e DT(x) (DT)^{-1}(x) D_e DT(x) \right) + \mathrm{tr}\left( (DT)^{-1}(x) D^2_{e,e} DT(x) \right) \tag{5.7} \\
& \quad = -D^2_{e,e} U(x) + \langle D^2(U+V)(T(x)) D_e T(x), D_e T(x) \rangle + \langle D(U+V)(T(x)), D^2_{e,e} T(x) \rangle .
\end{align*}
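
The left-hand side of (5.7) is the standard formula for the second derivative of $\log\det$ along a matrix path. Below is a minimal numerical sanity check under assumptions that are purely illustrative: the quadratic path `M(t)` is a hypothetical stand-in for $t \mapsto DT(x + te)$, and the random matrices, seed and step size are arbitrary choices rather than data from the text. The last lines also check the sign of the first trace term, which relies on $\mathrm{tr}(PQ) \ge 0$ for positive semi-definite $P, Q$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

def random_symmetric(size):
    """Random symmetric matrix (illustrative data only)."""
    g = rng.standard_normal((size, size))
    return 0.5 * (g + g.T)

# Hypothetical path M(t) = M0 + t*M1 + (t^2/2)*M2, so that
# M(0) = M0 (symmetric positive-definite), M'(0) = M1 and M''(0) = M2.
g0 = rng.standard_normal((n, n))
M0 = g0 @ g0.T + 5.0 * np.eye(n)
M1 = random_symmetric(n)
M2 = random_symmetric(n)

def log_det(t):
    sign, value = np.linalg.slogdet(M0 + t * M1 + 0.5 * t**2 * M2)
    return value

# Analytic second derivative at t = 0, in the form of the left-hand side of (5.7).
M0_inv = np.linalg.inv(M0)
analytic = -np.trace(M0_inv @ M1 @ M0_inv @ M1) + np.trace(M0_inv @ M2)

# Central finite difference of t -> log det M(t) at t = 0.
h = 1e-3
numeric = (log_det(h) - 2.0 * log_det(0.0) + log_det(-h)) / h**2

# Sign of the first term: P = M1 M0^{-1} M1 is positive semi-definite,
# hence tr(M0^{-1} P) >= 0.
P = M1 @ M0_inv @ M1
first_term = np.trace(M0_inv @ P)

print("analytic:", analytic, " finite difference:", numeric)
print("tr(M0^{-1} M1 M0^{-1} M1) =", first_term)
```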

Using that $DT = D^2\varphi > 0$, observe that $D_e DT (DT)^{-1} D_e DT \ge 0$. Recalling by (5.5)
that $D^2_{e,e} DT(x_0) \le 0$, and using the fact that $\mathrm{tr}(AB) \ge 0$ if $A, B \ge 0$, it follows that the
left-hand side of (5.7) is non-positive when evaluated at $x_0$. Noting by (5.5) that the last
summand on the right-hand side of (5.7) vanishes at this point, and using $D^2 V \ge 0$ and
(5.4), we conclude that:

\[ D^2_{e,e} U(x_0) \ge \langle D^2 U(T(x_0)) D_e D\varphi(x_0), D_e D\varphi(x_0) \rangle = D^2_{e,e} U(T(x_0)) \, |D^2_{e,e}\varphi(x_0)|^2 . \]

Since $D^2 U > 0$, we obtain the analogue of (5.3):

\[ \|T\|_{Lip}^2 = |D^2_{e,e}\varphi(x_0)|^2 \le \frac{D^2_{e,e} U(x_0)}{D^2_{e,e} U(T(x_0))} . \]

When $U$ is quadratic, this is already enough to guarantee that $T$ is contracting. To make
sure that the right-hand side is not greater than 1 under more general circumstances, we
would need by the mean-value theorem to ensure that:

\[ (D^3 U)|_y (e, e, x_0 - T(x_0)) \le 0 \quad \forall y \in [x_0, T(x_0)] . \tag{5.8} \]

By the uniqueness of the Brenier map and the symmetries of $\mu$ and $\nu$, we know that $T$
must satisfy our symmetry assumptions. Consequently, as in the proof of Corollary 4.1, $T$
must act radially on each $E_i$, $i = 1, \ldots, k$:

\[ \mathrm{Proj}_{E_i} T(x) = \begin{cases} T_i(\mathrm{Proj}_{E_0} x, |\mathrm{Proj}_{E_1} x|, \ldots, |\mathrm{Proj}_{E_k} x|) \dfrac{\mathrm{Proj}_{E_i} x}{|\mathrm{Proj}_{E_i} x|} & \text{if } \mathrm{Proj}_{E_i} x \ne 0 , \\ 0 & \text{otherwise.} \end{cases} \]

As the gradient of a convex function, we must have $\langle T(x) - T(y), x - y \rangle \ge 0$ for all $x, y \in \mathbb{R}^n$,
and using $y = x - 2\mathrm{Proj}_{E_i} x$ (reflecting $x$ in $E_i$ about the origin) implies that necessarily
$T_i \ge 0$. Consequently:

\[ \forall i = 1, \ldots, k \quad \exists a_i(x_0) \ge 0 \quad \mathrm{Proj}_{E_i} T(x_0) = a_i(x_0) \mathrm{Proj}_{E_i} x_0 . \]

We conclude from Lemma 2.4 that (5.8) would follow if we could show that:

\[ \forall x \in \mathbb{R}^n \quad \forall i = 1, \ldots, k \quad a_i(x) \le 1 . \tag{5.9} \]

Geometrically, this means that we have reduced the task of showing that $T$ is a contraction
to showing that $T$ is a contraction with respect to the origin on each $E_i$. Note that in
the radial case, this followed trivially from the monotonicity of $v$.

To show (5.9), we require the following additional ingredient [14, Theorem 6]. 

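Theorem 5.1 (Caffarelli). Let $U_1 \in C^{1,\alpha}(\Omega_1)$ and $U_2 \in C^{1,\alpha}(\Omega_2)$, where $\Omega_2 = \times_{i=1}^{n} [a_i, b_i] \subset \mathbb{R}^n$
and $\Omega_1 \supset \Omega_2$, so that $\int_{\Omega_i} \exp(-U_i(x))dx = 1$. Let $T$ denote the Brenier optimal-transport
map pushing forward $\exp(-U_1(x))dx$ onto $\exp(-U_2(x))dx$, and let $S$ denote a
fixed subset of the coordinates $\{1, \ldots, n\}$. Assume that for any $x \in \Omega_1$, $y \in \Omega_2$ and $j \in S$:

\[ \forall i \in S \;\; y_i \le x_i \text{ and } x_j = y_j \;\; \Longrightarrow \;\; \frac{\partial}{\partial x_j} U_1(x) \le \frac{\partial}{\partial y_j} U_2(y) . \tag{5.10} \]

Then $T(x)_i \le x_i$ for all $i \in S$, for any $x \in \Omega_1$.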

In our formulation, we have exchanged the source and target measures (using that
the Brenier map in this case is precisely the inverse of the original one), removed the
assumption that $\Omega_1 = \Omega_2$, and considered only a subset of the coordinates for which the
assumption and conclusion hold (as can be easily verified by inspecting the proof).

Fix a coordinate structure determined by our decomposition of $\mathbb{R}^n$ into $E_i$, let $Q$ denote
the set of coordinates corresponding to $E_0$, and let $S$ denote the set of all other coordinates,
corresponding to the subspaces $E_1, \ldots, E_k$. Set $C = [-R, R]^n$, $\Omega_1 = \mathbb{R}^Q \times \mathbb{R}_+^S$,
$\Omega_2 = [-R, R]^Q \times [0, R]^S$, $U_1 = U + c_1$ and $U_2 = U + V + c_2$, where $c_i$ are constants designed
to make $\exp(-U_i(x))dx$ probability measures on $\Omega_i$. The symmetries of $T$ described above
imply that it is enough to verify (5.9) for $x \in \Omega_1$ and that $T|_{\Omega_1}$ coincides with the map given by
Theorem 5.1. Consequently, the desired (5.9) will follow from the conclusion of Theorem
5.1 if we verify (5.10).

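Fix $j \in S$, corresponding to a subspace $E_l$. Lemma 2.5 implies that $\frac{\partial}{\partial y_j} V(y) \ge 0$ for
any $y \in \Omega_2$, and so it is enough to verify that for $x \in \Omega_1$ and $y \in \Omega_2$:

\[ \forall i \in S \;\; y_i \le x_i \text{ and } x_j = y_j \;\; \Longrightarrow \;\; \frac{\partial}{\partial x_j} U(x) \le \frac{\partial}{\partial y_j} U(y) . \tag{5.11} \]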

But $\frac{\partial}{\partial x_j} U(x) = \frac{\rho_l'(|\mathrm{Proj}_{E_l} x|)}{|\mathrm{Proj}_{E_l} x|} x_j$, and when $x_j$ is fixed, the coefficient in front of it is
non-increasing in $|\mathrm{Proj}_{E_l} x|$ since $\rho_l'$ was assumed concave and $\rho_l'(0) = 0$. Since $x \in \Omega_1$ and
$y \in \Omega_2$, the assumption $y_i \le x_i$ for all $i \in S$ implies that $|\mathrm{Proj}_{E_l} y| \le |\mathrm{Proj}_{E_l} x|$, confirming
the desired (5.11). This finally concludes the proof.

6 Comparing the two maps



In this section, we compare the map $T$ (as constructed in Subsection 1.2) with the Brenier
map $T_{opt}$.

First, it is natural to ask whether the two maps $T$ and $T_{opt}$ coincide, at least under the
assumptions of Theorem 1.1. To analyze this question, recall that $S_t$ was constructed as
follows:

\[ \frac{d}{dt} S_t(x) = W_t(S_t(x)) , \quad S_0 = Id , \quad \text{with } W_t := \nabla Z_t , \quad Z_t := -\log P_t^U(\exp(-V)) . \tag{6.1} \]

Denoting $B_t(x) := D^2 Z_t(S_t(x))$ and taking spatial derivatives, we obtain:

\[ \frac{d}{dt} DS_t(x) = B_t(x) DS_t(x) , \quad DS_0(x) = Id . \tag{6.2} \]

As is well known, a necessary and sufficient condition for a vector field to be the gradient of a
function on a simply connected domain is having a symmetric derivative tensor. It follows that:

\[ \text{if all of } \{B_t\}_{t \ge 0} \text{ commute with each other} , \tag{6.3} \]

ensuring that $DS_t$ remains symmetric along the flow, then we can conclude that $S_t$ is the
gradient of some function (for each $t$). Moreover, we could then write:

\[ DS_t(x) = \exp\left( \int_0^t B_s(x) \, ds \right) , \]

from which it would follow that $DS_t$ is pointwise positive semi-definite, and hence $S_t$ must
be the gradient of a convex function. The inverse map $T_t = S_t^{-1}$ would then be the gradient
of a convex function as well, and this property may be shown to be preserved in the limit
as $t \to \infty$, obtaining the Brenier map transporting $\mu = (S_\infty)_*(\nu)$ onto $\nu$.
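
The role of the commutation condition (6.3) in the linear ODE (6.2) can be seen in a toy computation. The sketch below is only an illustration under assumed data: the two 2x2 families passed to `integrate_flow` are hypothetical symmetric matrices chosen for simplicity, not the actual tensors $D^2 Z_t(S_t(x))$, and the RK4 integrator and step count are generic choices. For a commuting family the solution stays symmetric and matches the exponential formula above, while for a non-commuting family the solution loses symmetry.

```python
import numpy as np

def integrate_flow(B, t_end=1.0, steps=2000):
    """Solve dL/dt = B(t) L, L(0) = Id, with the classical RK4 scheme."""
    L = np.eye(2)
    h = t_end / steps
    for k in range(steps):
        t = k * h
        k1 = B(t) @ L
        k2 = B(t + 0.5 * h) @ (L + 0.5 * h * k1)
        k3 = B(t + 0.5 * h) @ (L + 0.5 * h * k2)
        k4 = B(t + h) @ (L + h * k3)
        L = L + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return L

def asymmetry(L):
    """Largest entry of |L - L^T|; zero exactly when L is symmetric."""
    return np.abs(L - L.T).max()

# Hypothetical symmetric building blocks (assumptions for illustration).
S = np.array([[2.0, 1.0], [1.0, 1.0]])
P = np.array([[1.0, 0.0], [0.0, 0.0]])
Q = np.array([[0.0, 1.0], [1.0, 0.0]])

# Commuting family B_t = (1 + t) S: the solution at t = 1 should equal
# exp(int_0^1 B_s ds) = exp(1.5 S), computed here by eigendecomposition.
L_comm = integrate_flow(lambda t: (1.0 + t) * S)
eigvals, eigvecs = np.linalg.eigh(1.5 * S)
L_closed = (eigvecs * np.exp(eigvals)) @ eigvecs.T

# Non-commuting family B_t = P + t Q (P and Q do not commute).
L_noncomm = integrate_flow(lambda t: P + t * Q)

print("asymmetry, commuting family:    ", asymmetry(L_comm))
print("asymmetry, non-commuting family:", asymmetry(L_noncomm))
```

This only illustrates that commutation is sufficient for symmetry at finite times; it says nothing about whether symmetry could be recovered in the limit $t \to \infty$.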

Condition (6.3) implies that in all one-dimensional situations ($n = 1$ or radially symmetric
data), both maps $T$ and $T_{opt}$ do coincide. However, it is easy to check that generically,
the sufficient condition (6.3) will be severely violated, for instance by constructing examples
(see below) so that for some $x$:

\[ 0 \ne \left[ \frac{d}{dt} B_t(x), B_t(x) \right] = \left[ D^2 \frac{d}{dt} Z_t + D^3 Z_t \cdot DZ_t , \, D^2 Z_t \right] (S_t(x)) \quad \forall t > 0 , \tag{6.4} \]

where $[A, B] = AB - BA$ denotes the Lie bracket. Moreover, it is not hard to show that
(6.4) implies that $DS_t(x)$ is non-symmetric on some non-empty interval $t \in (0, t_0)$, and that
for any non-empty interval $(t_1, t_2) \subset (0, \infty)$, $\{DS_t(x)\}_{t \in (t_1, t_2)}$ cannot all commute. In other
words, the path of diffeomorphisms $[0, \infty) \ni t \mapsto S_t$ will generically not coincide with the
path of optimal interpolating maps $[0, 1) \ni s \mapsto (1 - s) Id + s S_{opt}$, where $S_{opt} = T_{opt}^{-1}$ is the
Brenier map pushing forward $\nu$ onto $\mu$, and in fact the set of times $t$ where these two paths
intersect will be discrete.

All of this suggests that generically, the lack of symmetry (or path separation) should
persist in the limit as $t \to \infty$, and hence that the limiting map $T$ should be different from
$T_{opt}$. However, although we believe that (6.3) is actually also a necessary condition (at least
generically) for obtaining the Brenier map, we are unable to rule out the possibility that
the symmetry may be recovered in the limit. In particular, we are unable to show that the
two maps are different even for the following simple example, where (almost) everything
may be explicitly computed:

Example 6.1. Let $U, V$ be given by:

\[ U(x) = \frac{1}{2} \langle Ax, x \rangle , \quad V(x) = \frac{1}{2} \langle Bx, x \rangle , \]

where $A, B$ are positive-definite non-commuting matrices,

and set:

\[ \mu = c_1 \exp(-U(x))dx , \quad \nu = c_2 \exp(-(U(x) + V(x)))dx , \]

with $\{c_i\}$ chosen so that the resulting measures have total mass 1.

It is easy to see (e.g. [46, Example 1.7]) that $T_{opt}$ is a linear map given by the positive-definite
matrix $C_{opt} = A^{1/2} (A^{1/2} (A + B) A^{1/2})^{-1/2} A^{1/2}$. The Mehler formula [25] for an
affine Ornstein-Uhlenbeck diffusion implies that the tensor $D^2 Z_t = -D^2 \log P_t^U(\exp(-V))$
is an explicitly computable fixed matrix $M_t$ for every time $t \ge 0$, and so by (6.2), the flow
maps $\{S_t\}$ are also linear, given by a family of matrices $\{L_t\}$. Moreover, the $L_t$ satisfy an
explicit matrix-valued ODE, and one may also show that $L_t^{\top} (A + M_t) L_t = A + B$. The
resulting map $T$ is then the linear map given by the matrix $L_\infty^{-1}$, where $L_\infty = \lim_{t \to \infty} L_t$.

Showing that $T \ne T_{opt}$ when $A, B$ do not commute then amounts to proving that $L_\infty$
is not symmetric in this case; we were unable to verify this. When $A, B$ do commute, then
so do all the matrices $\{M_t\}$, so (6.3) is satisfied and $T = T_{opt}$.
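
The closed form for $C_{opt}$ in Example 6.1 is easy to test numerically. The following is a minimal sketch under assumptions: the specific 2x2 matrices `A` and `B` are hypothetical choices (positive-definite and non-commuting), the helper `spd_power` is introduced here for illustration, and the potentials are taken as $U(x) = \frac{1}{2}\langle Ax, x\rangle$, $V(x) = \frac{1}{2}\langle Bx, x\rangle$, so that $\mu$ and $\nu$ are centred Gaussians with covariances $A^{-1}$ and $(A+B)^{-1}$. The checks are that $C_{opt}$ is symmetric positive-definite, pushes the covariance of $\mu$ onto that of $\nu$, has operator norm at most 1, and satisfies the linear case of (5.6). It does not address the open question of whether $L_\infty$ is symmetric.

```python
import numpy as np

def spd_power(M, p):
    """Power M^p of a symmetric positive-definite matrix via eigendecomposition."""
    w, Q = np.linalg.eigh(M)
    return (Q * w**p) @ Q.T

# Hypothetical positive-definite, non-commuting matrices (illustration only).
A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, -0.3], [-0.3, 3.0]])

# C_opt = A^{1/2} (A^{1/2} (A + B) A^{1/2})^{-1/2} A^{1/2}
A_half = spd_power(A, 0.5)
C_opt = A_half @ spd_power(A_half @ (A + B) @ A_half, -0.5) @ A_half

# Push-forward of N(0, A^{-1}) under x -> C_opt x has covariance C A^{-1} C^T,
# which should equal (A + B)^{-1}.
pushed_cov = C_opt @ np.linalg.inv(A) @ C_opt.T
target_cov = np.linalg.inv(A + B)

# Linear case of (5.6): log det C_opt equals log(c_1 / c_2),
# i.e. (log det A - log det (A + B)) / 2.
lhs = np.linalg.slogdet(C_opt)[1]
rhs = 0.5 * (np.linalg.slogdet(A)[1] - np.linalg.slogdet(A + B)[1])

print("commutator norm ||AB - BA||:", np.linalg.norm(A @ B - B @ A))
print("covariance mismatch:", np.abs(pushed_cov - target_cov).max())
print("operator norm of C_opt:", np.linalg.norm(C_opt, 2))
print("log det check:", lhs, rhs)
```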

An additional aspect of comparing $T$ and $T_{opt}$ pertains to the condition that
$U$ be convex in Theorems 1.1 and 1.3. This condition was absolutely crucial in Caffarelli's
argument and the proof of Theorem 1.3. However, an inspection of the proof of Theorem
1.1 reveals that this condition was only used in the proof of (3.5), (3.6) and Lemma 3.2,
and it is actually possible to relax our condition to $D^2 U \ge -\epsilon Id$ by a careful adaptation
of the arguments (and in particular, avoid using the Spectral Theorem, since $-L$ will no
longer have a spectral gap). Unfortunately, the convexity of $U$ actually follows from the
other assumptions of Theorem 1.1, namely that $\mu = \exp(-U(x))dx$ has finite total mass
and that $\rho_i''' \le 0$ on $\mathbb{R}_+$, so ultimately there is no real gain here in using $T$ over $T_{opt}$. But
this difference in the significance of the convexity of $U$ to the proof perhaps reinforces the
intuition that these two maps should be (generically) different.

Before concluding, we mention a couple of advantages of working with the map $T$ over
the Brenier map $T_{opt}$. In the proof of the contraction property of $T_{opt}$ (Theorem 1.3),
Caffarelli's regularity theory for the fully-nonlinear Monge-Ampère equation was an essential
ingredient. In contrast, in our study of the map $T$ (Theorem 1.1), we only employed the
classical regularity results for linear parabolic PDEs. This lends our heat-diffusion construction
to further generalizations, in situations where the regularity for the Monge-Ampère
equation and the Brenier-McCann optimal-transport map has yet to be established, or
alternatively is known to be false, for instance in the Riemannian-manifold setting (see [54]).
In addition, other choices for the driving potential $Z_t$ in our flow scheme (6.1) are also
possible, in accordance with the property one wishes to establish.